SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

Intel Corporation

Endeavor (Intel Xeon Gold 6148, 2.40 GHz, DDR4-2666 MHz, SMT on, Turbo on)

SPECmpiM_peak2007 = Not Run

MPI2007 license: 13
Test sponsor: Intel Corporation
Tested by: Intel Corporation
Test date: Aug-2018
Hardware Availability: Aug-2018
Software Availability: Nov-2018
Benchmark results graph

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Base results (Peak: Not Run)

Benchmark       Ranks   Seconds   Ratio   Seconds   Ratio   Seconds   Ratio
104.milc        640     15.6      100     15.8      99.3    15.9      98.5
107.leslie3d    640     34.7      150     35.9      145     36.1      145
113.GemsFDTD    640     187       33.8    187       33.7    187       33.8
115.fds4        640     32.9      59.2    34.4      56.7    35.9      54.4
121.pop2        640     65.1      63.4    64.9      63.6    65.4      63.1
122.tachyon     640     17.1      163     17.5      160     16.9      166
126.lammps      640     90.0      32.4    89.9      32.4    90.0      32.4
127.wrf2        640     34.3      228     33.9      230     33.8      231
128.GAPgeofem   640     7.99      258     7.95      260     7.90      261
129.tera_tf     640     23.3      119     23.5      118     23.2      119
130.socorro     640     34.0      112     34.0      112     33.5      114
132.zeusmp2     640     21.1      147     21.1      147     21.0      148
137.lu          640     19.1      192     19.3      190     19.1      193
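
As a rough cross-check of the results table above, the following sketch takes the median of the three base ratios for each benchmark and forms the geometric mean of those medians, which is how SPEC composite metrics are generally defined. The names `base_ratios`, `median_ratios` and `geometric_mean` are illustrative and not part of the SPEC tools. Because the inputs are the rounded ratios printed in this report, the estimate may differ slightly from an officially published SPECmpiM_base2007 value.

```python
import math
import statistics

# Base ratios for the three runs of each benchmark, copied from the results table.
base_ratios = {
    "104.milc": [100, 99.3, 98.5],
    "107.leslie3d": [150, 145, 145],
    "113.GemsFDTD": [33.8, 33.7, 33.8],
    "115.fds4": [59.2, 56.7, 54.4],
    "121.pop2": [63.4, 63.6, 63.1],
    "122.tachyon": [163, 160, 166],
    "126.lammps": [32.4, 32.4, 32.4],
    "127.wrf2": [228, 230, 231],
    "128.GAPgeofem": [258, 260, 261],
    "129.tera_tf": [119, 118, 119],
    "130.socorro": [112, 112, 114],
    "132.zeusmp2": [147, 147, 148],
    "137.lu": [192, 190, 193],
}


def median_ratios(ratios):
    """Return the median ratio of the runs for each benchmark."""
    return {name: statistics.median(runs) for name, runs in ratios.items()}


def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


medians = median_ratios(base_ratios)
estimate = geometric_mean(medians.values())
print(f"Estimated base composite from rounded ratios: {estimate:.1f}")
```

The same two helpers could be reused for peak results if a peak run were available; here only base was run.
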
Hardware Summary
Type of System: Homogeneous
Compute Node: Intel Server System R2208WFTZS
Interconnect: Intel Omni-Path 100 series
File Server Node: Lustre FS
Total Compute Nodes: 16
Total Chips: 32
Total Cores: 640
Total Threads: 1280
Total Memory: 3 TB
Base Ranks Run: 640
Minimum Peak Ranks: --
Maximum Peak Ranks: --
Software Summary
C Compiler: Intel C++ Composer XE 2018 for Linux, Version 18.0.0 Build 20170811
C++ Compiler: Intel C++ Composer XE 2018 for Linux, Version 18.0.0 Build 20170811
Fortran Compiler: Intel Fortran Composer XE 2018 for Linux, Version 18.0.0 Build 20170811
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: Intel MPI Library 2019 Build 20180829
Other MPI Info: libfabric-1.6.1
Pre-processors: No
Other Software: None

Node Description: Intel Server System R2208WFTZS

Hardware
Number of nodes: 16
Uses of the node: Compute
Vendor: Intel
Model: Intel Server System R2208WFTZS
CPU Name: Intel Xeon Gold 6148
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 40
Cores per chip: 20
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.7 GHz
CPU MHz: 2400
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 1 MB I+D on chip per core
L3 Cache: 27.5 MB I+D on chip per chip
Other Cache: None
Memory: 192 GB (16 x 12 GB 2Rx4 DDR4-2666)
Disk Subsystem: ATA INTEL SSDSC2BA80
Other Hardware: None
Adapter: Intel Omni-Path Edge Switch 100 series
Number of Adapters: 1
Slot Type: PCI-Express x16
Data Rate: 12.5 GB/s
Ports Used: 1
Interconnect Type: Intel Omni-Path Fabric 100 series
Software
Adapter: Intel Omni-Path Edge Switch 100 series
Adapter Driver: IFS 10.7
Adapter Firmware: 1.26.1
Operating System: Oracle Linux Server release 7.4
Local File System: Linux/xfs
Shared File System: Lustre FS
System State: Multi-User
Other Software: IBM Platform LSF Standard 9.1.1.1
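
The totals in the Hardware Summary follow from the per-node figures in the compute node description. The short sketch below redoes that arithmetic; the variable names are illustrative, and the memory conversion assumes binary units (1 TB = 1024 GB), which is an assumption rather than something the report states.

```python
# Per-node values from the compute node description.
nodes = 16
chips_per_node = 2
cores_per_chip = 20
threads_per_core = 2
memory_per_node_gb = 192

# Base rank count from the Hardware Summary.
base_ranks = 640

total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core
total_memory_tb = nodes * memory_per_node_gb / 1024  # assumes 1 TB = 1024 GB
ranks_per_node = base_ranks // nodes

print(f"chips={total_chips} cores={total_cores} threads={total_threads}")
print(f"memory={total_memory_tb} TB, ranks per node={ranks_per_node}")
```

With 640 base ranks on 640 physical cores, the run places one MPI rank per core and leaves the second hardware thread of each core without its own rank.
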

Node Description: Lustre FS

Hardware
Number of nodes: 11
Uses of the node: Fileserver
Vendor: Intel
Model: Intel Server System R2208GZ4GC4
CPU Name: Intel Xeon E5-2680
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 16
Cores per chip: 8
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.5 GHz
CPU MHz: 2700
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 2 MB I+D on chip per chip
L3 Cache: None
Other Cache: None
Memory: 64 GB per node (8 x 8 GB 1600MHz Reg ECC DDR3)
Disk Subsystem: 136 TB 3 RAID with 8 SAS/SATA
Other Hardware: None
Adapter: Intel Omni-Path Fabric Adapter 100 series
Number of Adapters: 1
Slot Type: PCI-Express x16
Data Rate: 12.5 GB/s
Ports Used: 1
Interconnect Type: Intel Omni-Path Fabric 100 series
Software
Adapter: Intel Omni-Path Fabric Adapter 100 series
Adapter Driver: IFS 10.7
Adapter Firmware: 1.26.1
Operating System: Red Hat Enterprise Linux Server release 7.4
Local File System: None
Shared File System: Lustre FS
System State: Multi-User
Other Software: None

Interconnect Description: Intel Omni-Path 100 series

Hardware
Vendor: Intel
Model: Intel Omni-Path Fabric 100 series
Switch Model: Intel Omni-Path Edge Switch 100 series
Number of Switches: 24
Number of Ports: 48
Data Rate: 12.5 GB/s
Firmware: 1.26.1
Topology: Fat tree
Primary Use: MPI and I/O traffic

Submit Notes

The config file option 'submit' was used.

General Notes

 130.socorro (base): "nullify_ptrs" src.alt was used.
 129.tera_tf (base): "add_rank_support" src.alt was used.
 143.dleslie (base): "integer_overflow" src.alt was used.

 MPI startup command:
    mpiexec.hydra command was used to start MPI jobs.
    export I_MPI_FABRICS=shm:ofi
    export FI_PSM2_INJECT_SIZE=8192
    export I_MPI_PIN_DOMAIN=core
    export I_MPI_PIN_ORDER=bunch
    export FI_PSM2_DELAY=0
    export FI_PSM2_LAZY_CONN=1
    export I_MPI_COMPATIBILITY=3
 Spectre & Meltdown:
    Kernel: 3.10.0-862.11.6.el7.crt1.x86_64
    Microcode: 0x200004d
    l1tf: Mitigation: PTE Inversion
    meltdown: Mitigation: PTI
    spec_store_bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
    spectre_v1: Mitigation: Load fences, __user pointer sanitization
    spectre_v2: Mitigation: IBRS (kernel)
 BIOS settings:
    Intel Hyper-Threading Technology (SMT) = Enabled (default is Enabled)
    Intel Turbo Boost Technology (Turbo)   = Enabled (default is Enabled)
  RAM configuration:
    Compute nodes have 2x16-GB RDIMM on each memory channel.
  Network:
    The Endeavor Omni-Path fabric consists of 48-port switches: 24 core switches,
    connected to each leaf (rack) switch.
  HFI driver parameters:
    cache_size = 1024
    rcvhdrcnt = 4096
  Job placement:
    Each MPI job was assigned to a topologically compact set of nodes, i.e.
    the minimum number of leaf switches needed was used for each job: 1 switch
    for 40/80/160/320/640 ranks, and 2 switches for 1280 and 1980 ranks.
  IBM Platform LSF was used for job submission. It has no impact on performance.
    Information can be found at: http://www.ibm.com
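
The MPI startup settings listed in the notes above can be collected into a launch environment as in the sketch below. The report does not give the exact submit line, so the `-n` and `-ppn` arguments, the helper name `build_launch` and the executable name `./benchmark_binary` are assumptions made for illustration. The function only builds the command and environment; it does not start a job.

```python
import os

# Environment settings copied from the MPI startup notes.
MPI_ENV = {
    "I_MPI_FABRICS": "shm:ofi",
    "FI_PSM2_INJECT_SIZE": "8192",
    "I_MPI_PIN_DOMAIN": "core",
    "I_MPI_PIN_ORDER": "bunch",
    "FI_PSM2_DELAY": "0",
    "FI_PSM2_LAZY_CONN": "1",
    "I_MPI_COMPATIBILITY": "3",
}


def build_launch(ranks, ranks_per_node, program):
    """Return (argv, env) for a hypothetical mpiexec.hydra launch.

    Nothing is executed here; the caller decides whether to run it.
    """
    env = dict(os.environ)
    env.update(MPI_ENV)
    argv = [
        "mpiexec.hydra",
        "-n", str(ranks),             # total number of MPI ranks
        "-ppn", str(ranks_per_node),  # ranks placed on each node
        program,
    ]
    return argv, env


# 640 ranks over 16 nodes gives 40 ranks per node; the program name is a placeholder.
argv, env = build_launch(640, 40, "./benchmark_binary")
print(" ".join(argv))
```

On a cluster with Intel MPI installed, the pair could be passed to `subprocess.run(argv, env=env)`; in the reported run, jobs were submitted through the SPEC config file `submit` option and IBM Platform LSF instead.
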

Base Compiler Invocation

C benchmarks:

 mpiicc 

C++ benchmarks:

126.lammps:  mpiicpc 

Fortran benchmarks:

 mpiifort 

Benchmarks using both Fortran and C:

 mpiicc   mpiifort 

Base Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
126.lammps:  -DMPICH_IGNORE_CXX_SEEK 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 
130.socorro:  -assume nostd_intent_in 

Base Optimization Flags

C benchmarks:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX512   -no-prec-div   -ipo 

Fortran benchmarks:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 

Benchmarks using both Fortran and C:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 
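
To show how the compiler invocation, portability flags and optimization flags listed above combine, the sketch below assembles a compile command for 126.lammps, the one benchmark whose language is stated explicitly in this report. The helper `compile_command`, the `-c` compile-only option and the source file name `example.cpp` are illustrative assumptions; the SPEC tools build the real command lines from the config file.

```python
# Values copied from the Base Compiler Invocation, Base Portability Flags
# and Base Optimization Flags sections.
COMPILERS = {"C": "mpiicc", "C++": "mpiicpc", "Fortran": "mpiifort"}
BASE_OPT_FLAGS = ["-O3", "-xCORE-AVX512", "-no-prec-div", "-ipo"]
PORTABILITY_FLAGS = {
    "121.pop2": ["-DSPEC_MPI_CASE_FLAG"],
    "126.lammps": ["-DMPICH_IGNORE_CXX_SEEK"],
    "127.wrf2": ["-DSPEC_MPI_CASE_FLAG", "-DSPEC_MPI_LINUX"],
    "130.socorro": ["-assume", "nostd_intent_in"],
}


def compile_command(benchmark, language, source):
    """Build an illustrative compile-only command as a list of arguments."""
    flags = BASE_OPT_FLAGS + PORTABILITY_FLAGS.get(benchmark, [])
    return [COMPILERS[language], *flags, "-c", source]


# example.cpp is a placeholder, not a file from the benchmark suite.
cmd = compile_command("126.lammps", "C++", "example.cpp")
print(" ".join(cmd))
```

Benchmarks without an entry in `PORTABILITY_FLAGS` receive only the four base optimization flags.
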

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/EM64T_Intel140_flags.20190110.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/EM64T_Intel140_flags.20190110.xml.