SPEChpc(TM) 2021 Tiny Result

Lenovo Global Technology
ThinkSystem SD650-N V2
(Intel Xeon Platinum 8368Q, Tesla A100-SXM-40GB)

hpc2021 License: 28                          Test date:             Aug-2021
Test sponsor: Lenovo Global Technology       Hardware availability: Aug-2021
Tested by:    Lenovo Global Technology       Software availability: Aug-2021

               Base   Base   Thrds  Base      Base      Peak   Peak   Thrds  Peak      Peak
Benchmarks     Model  Ranks  pr Rnk Run Time  Ratio     Model  Ranks  pr Rnk Run Time  Ratio
-------------- ------ ------ ------ --------- --------- ------ ------ ------ --------- ---------
505.lbm_t      ACC         8      1      13.7       164 S
505.lbm_t      ACC         8      1      13.7       165 *
505.lbm_t      ACC         8      1      13.6       165 S
513.soma_t     ACC         8      1      36.1       103 S
513.soma_t     ACC         8      1      36.1       102 *
513.soma_t     ACC         8      1      36.2       102 S
518.tealeaf_t  ACC         8      1      99.2      16.6 S
518.tealeaf_t  ACC         8      1      99.1      16.6 S
518.tealeaf_t  ACC         8      1      99.2      16.6 *
519.clvleaf_t  ACC         8      1      20.2      81.9 S
519.clvleaf_t  ACC         8      1      20.0      82.5 S
519.clvleaf_t  ACC         8      1      20.2      81.9 *
521.miniswp_t  ACC         8      1      81.4      19.7 S
521.miniswp_t  ACC         8      1      82.7      19.4 S
521.miniswp_t  ACC         8      1      82.1      19.5 *
528.pot3d_t    ACC         8      1      44.2      48.1 S
528.pot3d_t    ACC         8      1      44.5      47.7 *
528.pot3d_t    ACC         8      1      44.7      47.6 S
532.sph_exa_t  ACC         8      1      73.1      26.7 *
532.sph_exa_t  ACC         8      1      73.0      26.7 S
532.sph_exa_t  ACC         8      1      73.3      26.6 S
534.hpgmgfv_t  ACC         8      1      69.7      16.9 S
534.hpgmgfv_t  ACC         8      1      69.4      16.9 *
534.hpgmgfv_t  ACC         8      1      68.3      17.2 S
535.weather_t  ACC         8      1      20.9       155 S
535.weather_t  ACC         8      1      21.0       153 *
535.weather_t  ACC         8      1      21.0       153 S
================================================================================================
505.lbm_t      ACC         8      1      13.7       165 *
513.soma_t     ACC         8      1      36.1       102 *
518.tealeaf_t  ACC         8      1      99.2      16.6 *
519.clvleaf_t  ACC         8      1      20.2      81.9 *
521.miniswp_t  ACC         8      1      82.1      19.5 *
528.pot3d_t    ACC         8      1      44.5      47.7 *
532.sph_exa_t  ACC         8      1      73.1      26.7 *
534.hpgmgfv_t  ACC         8      1      69.4      16.9 *
535.weather_t  ACC         8      1      21.0       153 *
SPEChpc 2021_tny_base                              48.5
SPEChpc 2021_tny_peak                           Not Run

BENCHMARK DETAILS
-----------------
Type of System: Homogenous Cluster
Compute Nodes Used: 2
Total Chips: 4
Total Cores: 152
Total Threads: 152
Total Memory: 1 TB
Compiler: Nvidia HPC SDK 21.5
MPI Library: Open MPI 4.0.5
Base Parallel Model: ACC
Base Ranks Run: 8
Base Threads Run: 1
Peak Parallel Models: Not Run

Node Description: ThinkSystem SD650-N V2
========================================

HARDWARE
--------
Number of nodes: 2
Uses of the node: compute
Vendor: Lenovo Global Technology
Model: ThinkSystem SD650-N V2
CPU Name: Intel Xeon Platinum 8368Q
CPU(s) orderable: 2 chips
Chips enabled: 2
Cores enabled: 76
Cores per chip: 38
Threads per core: 1
CPU Characteristics: Turbo up to 3.7 GHz
CPU MHz: 2600
Primary Cache: 32 KB I + 48 KB D on chip per core
Secondary Cache: 1280 KB I+D on chip per core
L3 Cache: 57 MB I+D on chip per chip
Other Cache: None
Memory: 512 GB (16 x 32 GB 2Rx8 PC4-3200A-R)
Disk Subsystem: 1 x 480 GB 2.5" SSD
Other Hardware: None
Accel Count: 4
Accel Model: Tesla A100 SXM4 40GB
Accel Vendor: Nvidia Corporation
Accel Type: GPU
Accel Connection: NVLink
Accel ECC enabled: Yes
Accel Description: Nvidia Tesla A100 SXM4 40GB
Adapter: Mellanox ConnectX-6 HDR
Number of Adapters: 1
Slot Type: PCI-Express 4.0 x16
Data Rate: 200 Gb/s
Ports Used: 1
Interconnect Type: Nvidia Mellanox ConnectX-6 HDR

SOFTWARE
--------
Accelerator Driver: 460.32.03
Adapter: Mellanox ConnectX-6 HDR
Adapter Driver: 5.1-2.3.7
Adapter Firmware: 20.28.1002
Operating System: Red Hat Enterprise Linux Server release 8.3, Kernel 4.18.0-193.el8.x86_64
Local File System: xfs
Shared File System: NFS
System State: Multi-user, run level 3
Other Software: None

Node Description: ThinkSystem SD650-N V2
========================================

HARDWARE
--------
Number of nodes: 1
Uses of the node: Fileserver
Vendor: Lenovo Global Technology
Model: ThinkSystem SD650-N V2
CPU Name: Intel Xeon Platinum 8368Q
CPU(s) orderable: 2 chips
Chips enabled: 2
Cores enabled: 76
Cores per chip: 38
Threads per core: 1
CPU Characteristics: Turbo up to 3.7 GHz
CPU MHz: 2600
Primary Cache: 32 KB I + 48 KB D on chip per core
Secondary Cache: 1280 KB I+D on chip per core
L3 Cache: 57 MB I+D on chip per chip
Other Cache: None
Memory: 512 GB (16 x 32 GB 2Rx8 PC4-3200A-R)
Disk Subsystem: 1 x 960 GB NVME 2.5" SSD
Other Hardware: None
Accel Count: 4
Accel Model: Tesla A100 SXM4 40GB
Accel Vendor: Nvidia
Accel Type: GPU
Accel Connection: Nvidia Tesla A100 SXM4 40GB
Accel ECC enabled: Yes
Accel Description: Nvidia Tesla A100 SXM4 40GB
Adapter: Mellanox ConnectX-6 HDR
Number of Adapters: 1
Slot Type: PCI-Express 4.0 x16
Data Rate: 200 Gb/s
Ports Used: 1
Interconnect Type: Nvidia Mellanox ConnectX-6 HDR

SOFTWARE
--------
Accelerator Driver: N/A
Adapter: Mellanox ConnectX-6 HDR
Adapter Driver: 5.1-2.3.7
Adapter Firmware: 20.28.1002
Operating System: Red Hat Enterprise Linux Server release 8.3
Local File System: xfs
Shared File System: N/A
System State: Multi-User, run level 3
Other Software: None

Interconnect Description: Nvidia Mellanox ConnectX-6 HDR
========================================================

HARDWARE
--------
Vendor: Nvidia
Model: Nvidia Mellanox ConnectX-6 HDR
Switch Model: N/A
Number of Switches: 0
Number of Ports: 0
Data Rate: N/A
Firmware: N/A
Topology: Direct Connect
Primary Use: MPI Traffic, NFS Access

SOFTWARE
--------

Submit Notes
------------
Individual ranks were bound to the CPU cores on the same NUMA node as the GPU
using 'numactl' within the following "bind.pl" perl script:

---- Start bind.pl ------
my %bind;
$bind{0} = "1-3";
$bind{1} = "4-7";
$bind{2} = "8-10";
$bind{3} = "11-14";
$bind{4} = "41-43";
$bind{5} = "44-47";
$bind{6} = "61-63";
$bind{7} = "64-67";
my $rank = $ENV{OMPI_COMM_WORLD_LOCAL_RANK};
my $cmd = "taskset -c $bind{$rank} ";
while (my $arg = shift) {
    $cmd .= "$arg ";
}
my $rc = system($cmd);
exit($rc);
---- End bind.pl ------

The config file option 'submit' was used.
    submit = mpirun ${MPIRUN_OPTS} --allow-run-as-root --oversubscribe -host 192.168.99.171:4,192.168.99.172:4 -x UCX_MEMTYPE_CACHE=n -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx5_0:1 -mca pml ucx -x UCX_TLS=sm,dc,rc,knem,cuda_copy,cuda_ipc -npernode 4 --map-by core -np $ranks

General Notes
-------------
Environment variables set by runhpc before the start of the run:
UCX_MEMTYPE_CACHE = "n"
UCX_TLS = "self,shm,cuda_copy"

Compiler Version Notes
----------------------
==============================================================================
CC  505.lbm_t(base) 513.soma_t(base) 518.tealeaf_t(base) 521.miniswp_t(base)
    534.hpgmgfv_t(base)
------------------------------------------------------------------------------
nvc 21.5-0 LLVM 64-bit target on x86-64 Linux -tp skylake
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
------------------------------------------------------------------------------

==============================================================================
CXXC 532.sph_exa_t(base)
------------------------------------------------------------------------------
nvc++ 21.5-0 LLVM 64-bit target on x86-64 Linux -tp skylake
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
------------------------------------------------------------------------------

==============================================================================
FC  519.clvleaf_t(base) 528.pot3d_t(base) 535.weather_t(base)
------------------------------------------------------------------------------
nvfortran 21.5-0 LLVM 64-bit target on x86-64 Linux -tp skylake
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
------------------------------------------------------------------------------

Base Compiler Invocation
------------------------
C benchmarks:
     mpicc

C++ benchmarks:
     mpicxx

Fortran benchmarks:
     mpif90

Base Portability Flags
----------------------
521.miniswp_t: -DUSE_KBA -DUSE_ACCELDIR
532.sph_exa_t: -DSPEC_USE_LT_IN_KERNELS --c++17

Base Optimization Flags
-----------------------
C benchmarks:
     -Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel
     -DSPEC_ACCEL_AWARE_MPI

C++ benchmarks:
     -Mfprelaxed -Mnouniform -Mstack_arrays -fast -acc=gpu -Minfo=accel
     -DSPEC_ACCEL_AWARE_MPI

Fortran benchmarks:
     -DSPEC_ACCEL_AWARE_MPI -Mfprelaxed -Mnouniform -Mstack_arrays -fast
     -acc=gpu -Minfo=accel

Base Other Flags
----------------
C benchmarks:
     -w

C++ benchmarks:
     -w

Fortran benchmarks:
     -w

The flags file that was used to format this result can be browsed at
http://www.spec.org/hpc2021/flags/nv2021_flags.html

You can also download the XML flags source by saving the following link:
http://www.spec.org/hpc2021/flags/nv2021_flags.xml

SPEChpc is a trademark of the Standard Performance Evaluation Corporation.
All other brand and product names appearing in this result are trademarks or
registered trademarks of their respective holders.
------------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact info@spec.org.
Copyright 2021-2023 Standard Performance Evaluation Corporation
Tested with SPEChpc2021 v1.0.1 on 2021-08-20 06:17:40-0400.
Report generated on 2023-08-25 18:57:48 by hpc2021 ASCII formatter v1.0.3.
Originally published on 2021-10-20.
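
The overall base score in the results table can be cross-checked from the per-run ratios. The sketch below is not part of the SPEC tools. It assumes the selected ("*") value for each benchmark is the median of its three runs and that the overall metric is the geometric mean of those medians. The names BASE_RATIOS, median_ratios and geometric_mean are made up for this illustration. Because the inputs are the rounded ratios printed in this report, the result is expected to land close to, but not necessarily exactly on, the published 48.5.

```python
import math
from statistics import median

# Base ratios for the three runs of each benchmark, copied from the results
# table in this report (rounded values as printed).
BASE_RATIOS = {
    "505.lbm_t":     [164, 165, 165],
    "513.soma_t":    [103, 102, 102],
    "518.tealeaf_t": [16.6, 16.6, 16.6],
    "519.clvleaf_t": [81.9, 82.5, 81.9],
    "521.miniswp_t": [19.7, 19.4, 19.5],
    "528.pot3d_t":   [48.1, 47.7, 47.6],
    "532.sph_exa_t": [26.7, 26.7, 26.6],
    "534.hpgmgfv_t": [16.9, 16.9, 17.2],
    "535.weather_t": [155, 153, 153],
}


def median_ratios(runs):
    """Return the median ratio per benchmark (assumed selection rule)."""
    return {name: median(values) for name, values in runs.items()}


def geometric_mean(values):
    """Geometric mean computed in log space to avoid a large product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


medians = median_ratios(BASE_RATIOS)
score = geometric_mean(medians.values())

for name, ratio in medians.items():
    print(f"{name:<14} {ratio:>6}")
print(f"Recomputed base score from rounded ratios: {score:.2f}")
```

The medians produced this way match the "*" rows of the table. Any small gap between the recomputed score and 48.5 would be consistent with the published figure having been derived from unrounded ratios, which this report does not list.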