Transtec (Test Sponsor: Helmholtz-Zentrum Dresden - Rossendorf) Hemera: Supermicro SuperServer 1029GQ-TXRT (Intel Xeon Gold 6136, Tesla P100-SXM2-16GB) |
SPEChpc 2021_sml_base = 9.75 |
SPEChpc 2021_sml_peak = Not Run |
hpc2021 License: | 065A | Test Date: | Sep-2021 |
---|---|---|---|
Test Sponsor: | Helmholtz-Zentrum Dresden - Rossendorf | Hardware Availability: | Jul-2017 |
Tested by: | Helmholtz-Zentrum Dresden - Rossendorf | Software Availability: | Jul-2021 |
Benchmark result graphs are available in the PDF report.
Benchmark | Base Model | Base Ranks | Base Thrds/Rnk | Run 1 Seconds | Run 1 Ratio | Run 2 Seconds | Run 2 Ratio |
---|---|---|---|---|---|---|---|
605.lbm_s | ACC | 32 | 1 | 99.9 | 15.5 | 99.6 | 15.6 |
613.soma_s | ACC | 32 | 1 | 133 | 12.0 | 135 | 11.8 |
618.tealeaf_s | ACC | 32 | 1 | 453 | 4.52 | 453 | 4.52 |
619.clvleaf_s | ACC | 32 | 1 | 108 | 15.3 | 108 | 15.3 |
621.miniswp_s | ACC | 32 | 1 | 195 | 5.65 | 196 | 5.62 |
628.pot3d_s | ACC | 32 | 1 | 136 | 12.3 | 136 | 12.4 |
632.sph_exa_s | ACC | 32 | 1 | 506 | 4.55 | 506 | 4.55 |
634.hpgmgfv_s | ACC | 32 | 1 | 161 | 6.05 | 161 | 6.06 |
635.weather_s | ACC | 32 | 1 | 78.6 | 33.1 | 78.7 | 33.0 |
SPEChpc 2021_sml_base | 9.75 | | | | | | |
SPEChpc 2021_sml_peak | Not Run | | | | | | |

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
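
As a rough cross-check of the headline figure, the sketch below recomputes the overall base metric from the per-benchmark ratios in the results table. It assumes that the overall metric is the geometric mean of one selected ratio per benchmark, and that with only two runs the lower ratio is the one selected; both are assumptions about the SPEC scoring rules, not something this report states. The names `BASE_RATIOS`, `selected_ratio`, and `overall_metric` are illustrative only.

```python
import math

# Ratios from the two base runs of each benchmark, as printed in the results table.
BASE_RATIOS = {
    "605.lbm_s": (15.5, 15.6),
    "613.soma_s": (12.0, 11.8),
    "618.tealeaf_s": (4.52, 4.52),
    "619.clvleaf_s": (15.3, 15.3),
    "621.miniswp_s": (5.65, 5.62),
    "628.pot3d_s": (12.3, 12.4),
    "632.sph_exa_s": (4.55, 4.55),
    "634.hpgmgfv_s": (6.05, 6.06),
    "635.weather_s": (33.1, 33.0),
}


def selected_ratio(runs):
    """Assumed selection rule: median of three runs, lower of two runs."""
    ordered = sorted(runs)
    return ordered[(len(ordered) - 1) // 2]


def overall_metric(ratios_by_benchmark):
    """Geometric mean of the selected ratio of every benchmark."""
    selected = [selected_ratio(runs) for runs in ratios_by_benchmark.values()]
    return math.exp(sum(math.log(r) for r in selected) / len(selected))


if __name__ == "__main__":
    print(f"Recomputed base metric: {overall_metric(BASE_RATIOS):.2f}")
```

Because the inputs are the rounded ratios printed in the report, the recomputed value can differ slightly from the published 9.75 in the last digit.
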
Hardware Summary | |
---|---|
Type of System: | Homogeneous Cluster |
Compute Node: | Compute Node |
Interconnect: | Infiniband (EDR) |
Compute Nodes Used: | 8 |
Total Chips: | 8 |
Total Cores: | 96 |
Total Threads: | 96 |
Total Memory: | 3 TB |
Software Summary | |
---|---|
Compiler: | C/C++/Fortran: Version 21.7 of NVIDIA HPC SDK for Linux |
MPI Library: | OpenMPI Version 4.0.5 |
Other MPI Info: | None |
Other Software: | None |
Base Parallel Model: | ACC |
Base Ranks Run: | 32 |
Base Threads Run: | 1 |
Peak Parallel Models: | Not Run |
Hardware | |
---|---|
Number of nodes: | 8 |
Uses of the node: | compute |
Vendor: | Intel |
Model: | SuperServer 1029GQ-TXRT |
CPU Name: | Intel Xeon Gold 6136 |
CPU(s) orderable: | 1 chip |
Chips enabled: | 1 |
Cores enabled: | 12 |
Cores per chip: | 12 |
Threads per core: | 1 |
CPU Characteristics: | Intel Turbo Boost Technology up to 3.7 GHz |
CPU MHz: | 3000 |
Primary Cache: | 32 KB I + 32 KB D on chip per core |
Secondary Cache: | 1 MB I+D on chip per core |
L3 Cache: | 25344 KB I+D on chip per chip |
Other Cache: | None |
Memory: | 384 GB (12 x 32GB 2Rx4 PC4-2666V-RB2-12) |
Disk Subsystem: | 1 x 500 GB |
Other Hardware: | None |
Accel Count: | 4 |
Accel Model: | Tesla P100-SXM2-16GB |
Accel Vendor: | NVIDIA Corporation |
Accel Type: | GPU |
Accel Connection: | PCIe 3.0 16x |
Accel ECC enabled: | Yes |
Adapter: | Mellanox MT4115 |
Number of Adapters: | 2 |
Slot Type: | PCI-Express 3.0 x16 |
Data Rate: | 100 Gb/s |
Ports Used: | 2 |
Interconnect Type: | EDR Infiniband |
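
The cluster-wide totals in the Hardware Summary follow from the per-node figures above multiplied by the node count. The sketch below reproduces that arithmetic as a consistency check. `NodeSpec` and `cluster_totals` are hypothetical names introduced here, and the memory conversion assumes 1 TB = 1024 GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures taken from the node hardware table."""
    chips: int
    cores_per_chip: int
    threads_per_core: int
    memory_gb: int
    accelerators: int


def cluster_totals(node, node_count):
    """Scale per-node figures to the whole cluster."""
    cores = node_count * node.chips * node.cores_per_chip
    return {
        "chips": node_count * node.chips,
        "cores": cores,
        "threads": cores * node.threads_per_core,
        "memory_tb": node_count * node.memory_gb / 1024,  # assumes 1 TB = 1024 GB
        "accelerators": node_count * node.accelerators,
    }


# One Xeon Gold 6136 chip, 12 cores, 1 thread per core, 384 GB, 4 Tesla P100 GPUs.
HEMERA_NODE = NodeSpec(chips=1, cores_per_chip=12, threads_per_core=1,
                       memory_gb=384, accelerators=4)

if __name__ == "__main__":
    for name, value in cluster_totals(HEMERA_NODE, node_count=8).items():
        print(f"{name}: {value}")
```

With 8 nodes this gives 8 chips, 96 cores, 96 threads, and 3 TB of memory, matching the summary, and 32 accelerators, which equals the 32 base ranks.
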
Software | |
---|---|
Adapter: | Mellanox MT4115 |
Adapter Firmware: | 12.28.2006 |
Operating System: | CentOS Linux release 7.9.2009 (Core), kernel 3.10.0-1160.6.1.el7.x86_64 |
Local File System: | xfs |
Shared File System: | GPFS Version 5.0.5.0; 6 NSD (vendor: NEC); 5 building blocks (vendor: NetApp): 2x (240 x 8 TB HDD), 1x (180 x 12 TB HDD), 1x (240 x 16 TB HDD), 1x (120 x 16 TB HDD) |
System State: | Multi-user, run level 3 |
Other Software: | None |
Hardware | |
---|---|
Vendor: | Mellanox Technologies |
Model: | Mellanox SB7790 |
Switch Model: | 36 x EDR 100 Gb/s |
Number of Switches: | 2 |
Number of Ports: | 36 |
Data Rate: | 100 Gb/s |
Topology: | Mesh (blocking factor: 8:1) |
Primary Use: | MPI Traffic, GPFS |
Software |
---|

The config file option 'submit' was used.

MPI startup command:

    mpirun --bind-to socket -np $ranks $[top]/mpirunCUDA.sh $command

Contents of $[top]/mpirunCUDA.sh:

    #!/bin/bash
    export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
    $@
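
The wrapper script sets `CUDA_VISIBLE_DEVICES` to the node-local rank index, so each MPI rank sees exactly one GPU. The sketch below illustrates the resulting rank-to-GPU mapping for 32 ranks on 8 nodes with 4 GPUs each. It assumes ranks fill the nodes in contiguous blocks of four; the report does not state the actual placement, and the function names are hypothetical.

```python
def local_rank(global_rank, ranks_per_node):
    """Node-local index of a rank, assuming contiguous block placement."""
    return global_rank % ranks_per_node


def visible_device(global_rank, ranks_per_node):
    """Value the wrapper would export as CUDA_VISIBLE_DEVICES (a string, like the variable)."""
    return str(local_rank(global_rank, ranks_per_node))


def placement(total_ranks, ranks_per_node):
    """Map node index -> list of (global rank, GPU id) pairs."""
    nodes = {}
    for rank in range(total_ranks):
        node = rank // ranks_per_node
        nodes.setdefault(node, []).append((rank, visible_device(rank, ranks_per_node)))
    return nodes


if __name__ == "__main__":
    # 32 base ranks, 4 accelerators per node, as listed in the report.
    for node, ranks in placement(32, 4).items():
        print(f"node {node}: {ranks}")
```

Under that placement assumption, every node hosts four ranks and each of its four GPUs is used by exactly one rank.
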

    ==============================================================================
    CC  605.lbm_s(base) 613.soma_s(base) 618.tealeaf_s(base) 621.miniswp_s(base)
        634.hpgmgfv_s(base)
    ------------------------------------------------------------------------------
    nvc 21.7-0 64-bit target on x86-64 Linux -tp skylake
    NVIDIA Compilers and Tools
    Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    ------------------------------------------------------------------------------

    ==============================================================================
    CXXC 632.sph_exa_s(base)
    ------------------------------------------------------------------------------
    nvc++ 21.7-0 64-bit target on x86-64 Linux -tp skylake
    NVIDIA Compilers and Tools
    Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    ------------------------------------------------------------------------------

    ==============================================================================
    FC  619.clvleaf_s(base) 628.pot3d_s(base) 635.weather_s(base)
    ------------------------------------------------------------------------------
    nvfortran 21.7-0 64-bit target on x86-64 Linux -tp skylake
    NVIDIA Compilers and Tools
    Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    ------------------------------------------------------------------------------

621.miniswp_s: | -DUSE_KBA -DUSE_ACCELDIR |
632.sph_exa_s: | -DSPEC_USE_LT_IN_KERNELS --c++17 |
-Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel -DSPEC_ACCEL_AWARE_MPI |
-Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel -DSPEC_ACCEL_AWARE_MPI |
-DSPEC_ACCEL_AWARE_MPI -Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel |