Transtec (Test Sponsor: Helmholtz-Zentrum Dresden - Rossendorf) Hemera: Supermicro SuperServer 1029GQ-TXRT (Intel Xeon Gold 6136, Tesla P100-SXM2-16GB)

SPEChpc 2021_tny_base = 13.7

SPEChpc 2021_tny_peak = Not Run

| hpc2021 License: | 065A | Test Date: | Sep-2021 |
|---|---|---|---|
| Test Sponsor: | Helmholtz-Zentrum Dresden - Rossendorf | Hardware Availability: | Jul-2017 |
| Tested by: | Helmholtz-Zentrum Dresden - Rossendorf | Software Availability: | Jul-2021 |
Benchmark result graphs are available in the PDF report.
Base results (SPEChpc 2021_tny_base = 13.7). Peak was not run (SPEChpc 2021_tny_peak = Not Run). Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

| Benchmark | Base Model | Base Ranks | Base Thrds/Rnk | Run 1 Seconds | Run 1 Ratio | Run 2 Seconds | Run 2 Ratio |
|---|---|---|---|---|---|---|---|
| 505.lbm_t | ACC | 4 | 1 | 108 | 20.8 | 108 | 20.8 |
| 513.soma_t | ACC | 4 | 1 | 173 | 21.3 | 174 | 21.3 |
| 518.tealeaf_t | ACC | 4 | 1 | 168 | 9.80 | 168 | 9.79 |
| 519.clvleaf_t | ACC | 4 | 1 | 87.3 | 18.9 | 86.8 | 19.0 |
| 521.miniswp_t | ACC | 4 | 1 | 188 | 8.50 | 189 | 8.47 |
| 528.pot3d_t | ACC | 4 | 1 | 124 | 17.1 | 124 | 17.1 |
| 532.sph_exa_t | ACC | 4 | 1 | 551 | 3.54 | 551 | 3.54 |
| 534.hpgmgfv_t | ACC | 4 | 1 | 120 | 9.81 | 120 | 9.81 |
| 535.weather_t | ACC | 4 | 1 | 78.5 | 41.1 | 78.7 | 41.0 |

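As a rough cross-check of the reported base score, the sketch below recomputes it from the per-benchmark ratios in the results table. It assumes the overall metric is the geometric mean of one selected ratio per benchmark, and that with two runs the lower ratio is the one selected; both are assumptions of this sketch rather than statements taken from this report. The ratios are the rounded values printed in the table, so only agreement at the reported precision should be expected.

```python
import math

# Per-benchmark base ratios copied from the results table: (run 1, run 2).
ratios = {
    "505.lbm_t": (20.8, 20.8),
    "513.soma_t": (21.3, 21.3),
    "518.tealeaf_t": (9.80, 9.79),
    "519.clvleaf_t": (18.9, 19.0),
    "521.miniswp_t": (8.50, 8.47),
    "528.pot3d_t": (17.1, 17.1),
    "532.sph_exa_t": (3.54, 3.54),
    "534.hpgmgfv_t": (9.81, 9.81),
    "535.weather_t": (41.1, 41.0),
}


def geometric_mean(values):
    """Geometric mean computed in log space to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: with two runs, the lower (slower) ratio is selected per benchmark.
selected = {name: min(pair) for name, pair in ratios.items()}

score = geometric_mean(selected.values())
print(f"Recomputed base score: {score:.1f}")
```

The result is dominated neither by the fastest benchmark (535.weather_t) nor by the slowest (532.sph_exa_t), which is the usual reason a geometric mean is preferred over an arithmetic mean for ratio-based scores.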
| Hardware Summary | |
|---|---|
| Type of System: | Homogenous Cluster |
| Compute Node: | Compute Node |
| Interconnect: | Infiniband (EDR) |
| Compute Nodes Used: | 1 |
| Total Chips: | 1 |
| Total Cores: | 12 |
| Total Threads: | 12 |
| Total Memory: | 384 GB |

| Software Summary | |
|---|---|
| Compiler: | C/C++/Fortran: Version 21.7 of NVIDIA HPC SDK for Linux |
| MPI Library: | OpenMPI Version 4.0.5 |
| Other MPI Info: | None |
| Other Software: | None |
| Base Parallel Model: | ACC |
| Base Ranks Run: | 4 |
| Base Threads Run: | 1 |
| Peak Parallel Models: | Not Run |

| Compute Node Hardware | |
|---|---|
| Number of nodes: | 1 |
| Uses of the node: | compute |
| Vendor: | Intel |
| Model: | SuperServer 1029GQ-TXRT |
| CPU Name: | Intel Xeon Gold 6136 |
| CPU(s) orderable: | 1 chips |
| Chips enabled: | 1 |
| Cores enabled: | 12 |
| Cores per chip: | 12 |
| Threads per core: | 1 |
| CPU Characteristics: | Intel Turbo Boost Technology up to 3.7 GHz |
| CPU MHz: | 3000 |
| Primary Cache: | 32 KB I + 32 KB D on chip per core |
| Secondary Cache: | 1 MB I+D on chip per core |
| L3 Cache: | 25344 KB I+D on chip per chip |
| Other Cache: | None |
| Memory: | 384 GB (12 x 32GB 2Rx4 PC4-2666V-RB2-12) |
| Disk Subsystem: | 1 x 500 GB |
| Other Hardware: | None |
| Accel Count: | 4 |
| Accel Model: | Tesla P100-SXM2-16GB |
| Accel Vendor: | NVIDIA Corporation |
| Accel Type: | GPU |
| Accel Connection: | PCIe 3.0 16x |
| Accel ECC enabled: | Yes |
| Adapter: | Mellanox MT4115 |
| Number of Adapters: | 2 |
| Slot Type: | PCI-Express 3.0 x16 |
| Data Rate: | 100 Gb/s |
| Ports Used: | 2 |
| Interconnect Type: | EDR Infiniband |

| Compute Node Software | |
|---|---|
| Adapter: | Mellanox MT4115 |
| Adapter Firmware: | 12.28.2006 |
| Operating System: | CentOS Linux release 7.9.2009 (Core) 3.10.0-1160.6.1.el7.x86_64 |
| Local File System: | xfs |
| Shared File System: | GPFS Version 5.0.5.0 6 NSD (vendor: NEC) 5 building blocks (vendor: NetApp): 2x (240 x 8 TB HDD) 1x (180 x 12 TB HDD) 1x (240 x 16 TB HDD) 1x (120 x 16 TB HDD) |
| System State: | Multi-user, run level 3 |
| Other Software: | None |

| Interconnect Hardware | |
|---|---|
| Vendor: | Mellanox Technologies |
| Model: | Mellanox SB7790 |
| Switch Model: | 36 x EDR 100 Gb/s |
| Number of Switches: | 2 |
| Number of Ports: | 36 |
| Data Rate: | 100 Gb/s |
| Topology: | Mesh (blocking factor: 8:1) |
| Primary Use: | MPI Traffic, GPFS |

| Interconnect Software |
|---|

The config file option 'submit' was used.

MPI startup command:

    mpirun --bind-to socket -np $ranks $[top]/mpirunCUDA.sh $command

Contents of $[top]/mpirunCUDA.sh:

    #!/bin/bash
    export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
    $@

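The wrapper script above pins each MPI rank to one GPU by exporting `CUDA_VISIBLE_DEVICES` with the value of `OMPI_COMM_WORLD_LOCAL_RANK`, which Open MPI sets for every process it launches. The sketch below only illustrates the resulting rank-to-device mapping for the four ranks and four accelerators used in this run; `env_for_local_rank` is a hypothetical helper introduced for illustration and is not part of the benchmark setup.

```python
import os


def env_for_local_rank(local_rank, base_env=None):
    """Return a copy of the environment that exposes one GPU to this rank.

    Hypothetical helper: mirrors what the wrapper script does with
    `export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK`.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(local_rank)
    return env


# Four ranks on one node, matching "Base Ranks Run: 4" and "Accel Count: 4".
mapping = {
    rank: env_for_local_rank(rank, base_env={})["CUDA_VISIBLE_DEVICES"]
    for rank in range(4)
}
print(mapping)  # {0: '0', 1: '1', 2: '2', 3: '3'}
```

With this one-to-one mapping, each rank sees exactly one device, so the application can always select device 0 inside its own process without coordinating with the other ranks.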

| Compiler | Benchmarks (base) | Version |
|---|---|---|
| CC | 505.lbm_t, 513.soma_t, 518.tealeaf_t, 521.miniswp_t, 534.hpgmgfv_t | nvc 21.7-0 64-bit target on x86-64 Linux -tp skylake |
| CXXC | 532.sph_exa_t | nvc++ 21.7-0 64-bit target on x86-64 Linux -tp skylake |
| FC | 519.clvleaf_t, 528.pot3d_t, 535.weather_t | nvfortran 21.7-0 64-bit target on x86-64 Linux -tp skylake |

NVIDIA Compilers and Tools. Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

| Benchmark | Flags |
|---|---|
| 521.miniswp_t: | -DUSE_KBA -DUSE_ACCELDIR |
| 532.sph_exa_t: | -DSPEC_USE_LT_IN_KERNELS --c++17 |
| | -Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel -DSPEC_ACCEL_AWARE_MPI |
| | -Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel -DSPEC_ACCEL_AWARE_MPI |
| | -DSPEC_ACCEL_AWARE_MPI -Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel |