SPEChpc™ 2021 Small Result

Copyright 2021-2023 Standard Performance Evaluation Corporation

Lenovo Global Technology

ThinkSystem SR670 V2 (Intel Xeon Platinum 8380, Nvidia A100-PCIE-80G)

SPEChpc 2021_sml_base = 8.17

SPEChpc 2021_sml_peak = Not Run

hpc2021 License: 28
Test Sponsor: Lenovo Global Technology
Tested by: Lenovo Global Technology
Test Date: Aug-2021
Hardware Availability: Aug-2021
Software Availability: Aug-2021

Benchmark result graphs are available in the PDF report.

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Base results (three runs per benchmark):

Benchmark        Model  Ranks  Thrds/Rnk  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
605.lbm_s        ACC        8          1     91.7  16.90     91.9  16.90     91.5  16.90
613.soma_s       ACC        8          1     1270  12.60     1270  12.60     1270  12.60
618.tealeaf_s    ACC        8          1     6140   3.34     6140   3.34     6140   3.34
619.clvleaf_s    ACC        8          1     1700   9.70     1690   9.78     1710   9.67
621.miniswp_s    ACC        8          1     2170   5.07     2190   5.03     2180   5.04
628.pot3d_s      ACC        8          1     1760   9.52     1760   9.53     1760   9.52
632.sph_exa_s    ACC        8          1     5140   4.47     5150   4.47     5110   4.50
634.hpgmgfv_s    ACC        8          1     2430   4.02     2430   4.01     2420   4.03
635.weather_s    ACC        8          1     95.4  27.30     95.4  27.30     95.3  27.30

SPEChpc 2021_sml_base  8.17
SPEChpc 2021_sml_peak  Not Run
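As a cross-check of the table above, the overall base metric can be approximately reproduced from the per-benchmark ratios. The sketch below assumes, in line with SPEC's published reporting rules, that each benchmark contributes the median of its three ratios and that the overall score is the geometric mean of those medians. The names BASE_RATIOS and overall_score are illustrative only. Because the input ratios are already rounded for display, the result should agree with the reported score only to the displayed precision.

```python
import math
import statistics

# Base ratios copied from the results table, three runs per benchmark.
BASE_RATIOS = {
    "605.lbm_s":     [16.90, 16.90, 16.90],
    "613.soma_s":    [12.60, 12.60, 12.60],
    "618.tealeaf_s": [3.34, 3.34, 3.34],
    "619.clvleaf_s": [9.70, 9.78, 9.67],
    "621.miniswp_s": [5.07, 5.03, 5.04],
    "628.pot3d_s":   [9.52, 9.53, 9.52],
    "632.sph_exa_s": [4.47, 4.47, 4.50],
    "634.hpgmgfv_s": [4.02, 4.01, 4.03],
    "635.weather_s": [27.30, 27.30, 27.30],
}


def overall_score(ratios_by_benchmark):
    """Return the geometric mean of the per-benchmark median ratios."""
    medians = [statistics.median(runs) for runs in ratios_by_benchmark.values()]
    # Average the logarithms, then exponentiate, to get the geometric mean.
    log_mean = sum(math.log(m) for m in medians) / len(medians)
    return math.exp(log_mean)


if __name__ == "__main__":
    print(f"Recomputed base score: {overall_score(BASE_RATIOS):.2f}")
```

The low-ratio benchmarks (618.tealeaf_s, 634.hpgmgfv_s, 632.sph_exa_s) pull the geometric mean down more than an arithmetic mean would, which is why the overall score sits well below the highest individual ratios.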
Hardware Summary
Type of System: Homogeneous
Compute Node: ThinkSystem SR670 V2
Interconnect: None
File Server Node: ThinkSystem SR670 V2
Compute Nodes Used: 1
Total Chips: 2
Total Cores: 80
Total Threads: 80
Total Memory: 512 GB
Software Summary
Compiler: Nvidia HPC SDK 21.5
MPI Library: Open MPI 4.0.5
Other MPI Info: None
Base Parallel Model: ACC
Base Ranks Run: 8
Base Threads Run: 1
Peak Parallel Models: Not Run

Node Description: ThinkSystem SR670 V2

Hardware
Number of nodes: 1
Uses of the node: compute
Vendor: Lenovo Global Technology
Model: ThinkSystem SR670 V2
CPU Name: Intel Xeon Platinum 8380
CPU(s) orderable: 2 chips
Chips enabled: 2
Cores enabled: 80
Cores per chip: 40
Threads per core: 1
CPU Characteristics: Intel Turbo Boost Technology up to 3.4 GHz
CPU MHz: 2300
Primary Cache: 32 KB I + 48 KB D on chip per core
Secondary Cache: 1280 KB I+D on chip per core
L3 Cache: 60 MB I+D on chip per chip
Other Cache: None
Memory: 512 GB (16 x 32 GB 2Rx8 PC4-3200A-R)
Disk Subsystem: 1 x 4 TB NVMe SSD
Other Hardware: None
Accel Count: 8
Accel Model: Tesla A100 PCIe 80GB
Accel Vendor: Nvidia Corporation
Accel Type: GPU
Accel Connection: PCIe Gen4 x16
Accel ECC enabled: Yes
Accel Description: Nvidia Tesla A100 PCIe 80GB
Adapter: Mellanox ConnectX-6 HDR
Number of Adapters: 1
Slot Type: PCI-Express 4.0 x16
Data Rate: 200 Gb/s
Ports Used: 1
Interconnect Type: Nvidia Mellanox ConnectX-6 HDR
Software
Accelerator Driver: 470.42.01
Adapter: Mellanox ConnectX-6 HDR
Adapter Driver: 5.2-1.0.4
Adapter Firmware: 20.28.1002
Operating System: Red Hat Enterprise Linux Server release 8.3, Kernel 4.18.0-193.el8.x86_64
Local File System: xfs
Shared File System: XFS
System State: Multi-user, run level 3
Other Software: None

Node Description: ThinkSystem SR670 V2

Hardware
Number of nodes: 1
Uses of the node: Fileserver
Vendor: Lenovo Global Technology
Model: ThinkSystem SR670 V2
CPU Name: Intel Xeon Platinum 8380
CPU(s) orderable: 2 chips
Chips enabled: 2
Cores enabled: 80
Cores per chip: 40
Threads per core: 1
CPU Characteristics: Turbo up to 3.4 GHz
CPU MHz: 2300
Primary Cache: 32 KB I + 48 KB D on chip per core
Secondary Cache: 1280 KB I+D on chip per core
L3 Cache: 60 MB I+D on chip per chip
Other Cache: None
Memory: 512 GB (16 x 32 GB 2Rx8 PC4-3200A-R)
Disk Subsystem: 1 x 4 TB NVMe SSD
Other Hardware: None
Accel Count: 8
Accel Model: Tesla A100 PCIe 80GB
Accel Vendor: Nvidia
Accel Type: GPU
Accel Connection: PCIe Gen4 x16
Accel ECC enabled: Yes
Accel Description: Nvidia Tesla A100 PCIe 80GB
Adapter: Mellanox ConnectX-6 HDR
Number of Adapters: 1
Slot Type: PCI-Express 4.0 x16
Data Rate: 200 Gb/s
Ports Used: 1
Interconnect Type: Nvidia Mellanox ConnectX-6 HDR
Software
Accelerator Driver: None
Adapter: Mellanox ConnectX-6 HDR
Adapter Driver: 5.2-1.0.4
Adapter Firmware: 20.28.1002
Operating System: Red Hat Enterprise Linux Server release 8.3
Local File System: xfs
Shared File System: None
System State: Multi-user, run level 3
Other Software: None

Interconnect Description: None

Submit Notes

Individual ranks were bound to the CPU cores on the same NUMA node as
the GPU using 'taskset' within the following "bind.pl" Perl script:
---- Start bind.pl ------
my %bind;
$bind{0} = "1-3";
$bind{1} = "4-7";
$bind{2} = "8-10";
$bind{3} = "11-14";
$bind{4} = "41-43";
$bind{5} = "44-47";
$bind{6} = "61-63";
$bind{7} = "64-67";
my $rank = $ENV{OMPI_COMM_WORLD_LOCAL_RANK};
my $cmd = "taskset -c $bind{$rank} ";
while (my $arg = shift) {
 $cmd .= "$arg ";
}
my $rc = system($cmd);
exit($rc);
---- End bind.pl ------
The config file option 'submit' was used.
submit = mpirun --allow-run-as-root -x UCX_MEMTYPE_CACHE=n
-host localhost:8 -np $ranks perl $[top]/bind.pl $command
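
For readers less familiar with Perl, the wrapper logic in bind.pl can be sketched in Python. This is an illustrative re-expression, not the script used for the run. It reuses the core lists from bind.pl and the OMPI_COMM_WORLD_LOCAL_RANK variable read by that script; the names BIND, build_command, and main are assumptions introduced here. One deliberate difference: the command is passed as an argument list rather than a shell string, and the child's return code is handed back as the wrapper's exit status.

```python
import os
import subprocess
import sys

# Core lists copied from bind.pl, keyed by the local MPI rank.
BIND = {
    0: "1-3",
    1: "4-7",
    2: "8-10",
    3: "11-14",
    4: "41-43",
    5: "44-47",
    6: "61-63",
    7: "64-67",
}


def build_command(local_rank, argv):
    """Prefix the benchmark command with taskset for this rank's core list."""
    return ["taskset", "-c", BIND[local_rank], *argv]


def main(argv):
    # Open MPI exports the rank index within the node in this variable.
    local_rank = int(os.environ["OMPI_COMM_WORLD_LOCAL_RANK"])
    # subprocess.call returns the child's return code.
    return subprocess.call(build_command(local_rank, argv))


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

With the submit line above, mpirun would start eight copies of such a wrapper, and each copy would select its own core list from its local rank before launching the benchmark command.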

General Notes

Environment variables set by runhpc before the start of the run:
UCX_MEMTYPE_CACHE = "n"
UCX_TLS = "self,shm,cuda_copy"

Compiler Version Notes

==============================================================================
 CC  605.lbm_s(base) 613.soma_s(base) 618.tealeaf_s(base) 621.miniswp_s(base)
      634.hpgmgfv_s(base)
------------------------------------------------------------------------------
nvc 21.5-0 LLVM 64-bit target on x86-64 Linux -tp skylake 
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION.  All rights reserved.
------------------------------------------------------------------------------

==============================================================================
 CXXC 632.sph_exa_s(base)
------------------------------------------------------------------------------
nvc++ 21.5-0 LLVM 64-bit target on x86-64 Linux -tp skylake 
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION.  All rights reserved.
------------------------------------------------------------------------------

==============================================================================
 FC  619.clvleaf_s(base) 628.pot3d_s(base) 635.weather_s(base)
------------------------------------------------------------------------------
nvfortran 21.5-0 LLVM 64-bit target on x86-64 Linux -tp skylake 
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION.  All rights reserved.
------------------------------------------------------------------------------

Base Compiler Invocation

C benchmarks:

 mpicc 

C++ benchmarks:

 mpicxx 

Fortran benchmarks:

 mpif90 

Base Portability Flags

621.miniswp_s:  -DUSE_KBA   -DUSE_ACCELDIR 
632.sph_exa_s:  -DSPEC_USE_LT_IN_KERNELS   --c++17 

Base Optimization Flags

C benchmarks:

 -Mfprelaxed   -Mnouniform   -Mstack_arrays   -fast   -acc=gpu   -DSPEC_ACCEL_AWARE_MPI 

C++ benchmarks:

 -Mfprelaxed   -Mnouniform   -Mstack_arrays   -fast   -acc=gpu   -DSPEC_ACCEL_AWARE_MPI 

Fortran benchmarks:

 -DSPEC_ACCEL_AWARE_MPI   -Mfprelaxed   -Mnouniform   -Mstack_arrays   -fast   -acc=gpu 

Base Other Flags

C benchmarks:

 -w 

C++ benchmarks:

 -w 

Fortran benchmarks:

 -w 

The flags file that was used to format this result can be browsed at
http://www.spec.org/hpc2021/flags/nv2021_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/hpc2021/flags/nv2021_flags.xml.