HPC Cluster Integration and Benchmarking
HPC solutions can only perform as well as their weakest link. Therefore, from cache coherence and checkpointing to resource management, each hardware component must perform at its best for the entire cluster to meet expectations. Supermicro has developed a unique testing suite that starts at the component level and builds outward. This testing suite allows our engineers to catch and troubleshoot any performance discrepancies and to make the appropriate changes to ensure the overall quality of the final cluster.
The Supermicro testing suite covers the entire product end-to-end, encompassing both fine-grained component details and cluster-level testing. This process reduces deployment time and effort, allowing our product to reach the customer site ready to plug and play.
The Supermicro Benchmarking team is here to show off our products to you! To give our customers a gauge of the raw power our products are capable of, we offer an array of fine-grained, per-component benchmarks as well as cluster-level benchmarking. The numbers generated by our team help our end customers understand the full potential of the product they will be using and give insight into the capabilities of their cluster. Our benchmarking process is custom designed around the software and workloads the customer will run on their cluster, giving further assurance that the cluster will be ready to go as soon as it is unpacked and plugged in.
The table below contains our standard benchmarks, but we are happy to run others as well. If there is a benchmark you are interested in that does not appear on the list, please feel free to contact us.
Table 1. Cluster Test Suite

| Cluster Test Suite | | |
|---|---|---|
| Cluster Level Test | WRF, NAMD | |
| Benchmark Test | HPL, HPCC, HPCG, Igatherv, DGEMM, PTRANS, FFT | |
| Parallel Storage Test | IOR, MDTest | |
| System Component Test | CPU | Stress-ng, CPU-Stress |
| | Memory | STREAM, SAT |
| | Network | Netperf, Iperf, OSU benchmarks |
| | HDD | FIO, IOzone, dd, hdparm |
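The component-level tests in Table 1 are all command-line tools, so they lend themselves to scripting. The Python sketch below is a hypothetical illustration, not part of the Supermicro test suite, of driving two of them, stress-ng for CPU and iperf3 for the network, and capturing their output for later comparison; it assumes both tools are installed and that a peer node, here called node02, is already running `iperf3 -s`.

```python
import json
import subprocess

def run(cmd):
    """Run a command, capture its output, and raise if it exits non-zero."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True)

# CPU check: stress-ng spins one worker per online CPU (--cpu 0) for 60 s and
# prints a bogo-ops metrics table (written to stderr) with --metrics-brief.
cpu = run(["stress-ng", "--cpu", "0", "--timeout", "60s", "--metrics-brief"])
print("stress-ng metrics:\n", cpu.stderr)

# Network check: iperf3 client run against a peer node that is already running
# `iperf3 -s`; -J requests JSON output so the result is easy to archive and diff.
peer = "node02"  # hypothetical hostname of the iperf3 server node
net = run(["iperf3", "-c", peer, "-t", "10", "-J"])
gbits = json.loads(net.stdout)["end"]["sum_received"]["bits_per_second"] / 1e9
print(f"iperf3 throughput to {peer}: {gbits:.2f} Gbit/s")
```

In a real acceptance flow, each captured result would be compared against the expected baseline for that component before the node joins the cluster.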
Table 2. HPC Software Stack

| Cluster Software Stack | | |
|---|---|---|
| Deep Learning Environment | Frameworks | Caffe, Caffe2, Caffe-MPI, Chainer, Microsoft CNTK, Keras, MXNet, TensorFlow, Theano, PyTorch |
| | Libraries | cuDNN, NCCL, cuBLAS |
| | User Access | NVIDIA DIGITS |
| Programming Environment | Development & Performance Tools | Intel Parallel Studio XE Cluster Edition, PGI Cluster Development Kit, GNU Toolchain, NVIDIA CUDA |
| | Scientific and Communication Libraries | Intel MPI, MVAPICH2, MVAPICH, IBM Spectrum MPI, Open MPI |
| | Debuggers | Intel IDB, PGI PGDBG, GNU GDB |
| Schedulers, File Systems and Management | Resource Management / Job Scheduling | Adaptive Computing Moab, Maui, TORQUE, SLURM, Altair PBS Professional, IBM Spectrum LSF, Grid Engine |
| | File Systems | Lustre, NFS, GPFS, Local (ext3, ext4, XFS) |
| | Cluster Management | Beowulf, xCAT, OpenHPC, Rocks, Bright Cluster Manager for HPC including support for NVIDIA Data Center GPU Manager |
| Operating Systems and Drivers | Drivers & Network Mgmt. | Accelerator Software Stack and Drivers, OFED, OPA |
| | Operating Systems | Linux (RHEL, CentOS, SUSE Enterprise, Ubuntu, etc.) |
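To illustrate how the scheduler layer in Table 2 ties a cluster-level benchmark run together, the sketch below submits an HPL job through SLURM from Python. It is a hypothetical example rather than Supermicro's own tooling: the node count, tasks per node, and the `xhpl` binary path are placeholder values that would come from the actual cluster configuration and HPL.dat tuning.

```python
import re
import subprocess

# Hypothetical job parameters; real values depend on cluster size and HPL.dat tuning.
nodes = 4
tasks_per_node = 32
hpl_binary = "./xhpl"  # placeholder path to the compiled HPL executable

# Submit the run through SLURM. `--wrap` lets sbatch take a one-line command
# instead of a job script; srun then launches the MPI ranks inside the allocation.
submit = subprocess.run(
    [
        "sbatch",
        f"--nodes={nodes}",
        f"--ntasks-per-node={tasks_per_node}",
        "--job-name=hpl-benchmark",
        "--output=hpl-%j.out",       # %j expands to the SLURM job id
        f"--wrap=srun {hpl_binary}",
    ],
    capture_output=True,
    text=True,
    check=True,
)

# On success sbatch prints "Submitted batch job <id>"; pull out the job id.
job_id = re.search(r"Submitted batch job (\d+)", submit.stdout).group(1)
print(f"HPL job {job_id} queued; output will be written to hpl-{job_id}.out")

# Show the job's queue entry (squeue prints nothing for it once it has finished).
print(subprocess.run(["squeue", "-j", job_id], capture_output=True, text=True).stdout)
```

The same pattern extends to the other cluster-level workloads in Table 1, such as WRF or NAMD, by swapping the wrapped command and job geometry.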