SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

SGI

SGI ICE XA
(Intel Xeon E5-2690 v4, 2.6 GHz)

MPI2007 license: 14
Test date: Jun-2016
Test sponsor: SGI
Hardware Availability: May-2016
Tested by: SGI
Software Availability: Jun-2016

Results Table

               Base                                                  | Peak
Benchmark      Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio | Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
104.milc         800     9.35    167     8.29    189     8.21    191 |   800     9.35    167     8.29    189     8.21    191
107.leslie3d     800     28.0    187     27.6    189     26.9    194 |  1120     24.2    216     23.9    218     23.7    221
113.GemsFDTD     800      206   30.6      206   30.6      206   30.6 |   320      192   32.9      191   33.1      190   33.1
115.fds4         800     9.15    213     9.27    210     9.27    210 |   800     9.15    213     9.27    210     9.27    210
121.pop2         800     74.7   55.2     74.9   55.1     74.7   55.2 |   512     68.0   60.7     68.1   60.6     68.0   60.7
122.tachyon      800     15.5    181     15.0    187     15.2    184 |  1120     11.7    240     11.7    240     11.7    239
126.lammps       800     88.4   33.0     88.1   33.1     88.1   33.1 |   320     78.3   37.2     78.0   37.4     78.2   37.3
127.wrf2         800     30.1    259     29.3    266     30.3    258 |   800     30.1    259     29.3    266     30.3    258
128.GAPgeofem    800     9.03    229     8.88    233     8.85    233 |  1024     8.05    257     7.81    265     7.82    264
129.tera_tf      800     19.9    139     19.9    139     19.9    139 |  1024     17.2    161     17.1    162     17.1    162
130.socorro      800     23.0    166     23.1    165     23.3    164 |   640     20.0    191     20.4    187     20.4    187
132.zeusmp2      800     21.4    145     21.4    145     21.2    146 |   512     18.8    165     19.1    163     18.9    164
137.lu           800     21.0    175     21.0    175     21.0    175 |   512     19.3    190     19.3    190     19.4    190
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
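
As a rough illustration of how the results table is read: each benchmark is run three times, the median run is the reported measurement, and SPEC's overall metrics are geometric means of the per-benchmark median ratios. The Python sketch below recomputes this for the base runs from the ratios transcribed from the table. The names `BASE_RATIOS`, `median_ratios`, `geometric_mean`, and `overall_score` are hypothetical, and because the inputs are the rounded table values, the result is only an approximation and not the output of the official SPEC tools.

```python
from math import exp, log
from statistics import median

# Base ratios of the three runs per benchmark, transcribed from the results table.
BASE_RATIOS = {
    "104.milc": (167, 189, 191),
    "107.leslie3d": (187, 189, 194),
    "113.GemsFDTD": (30.6, 30.6, 30.6),
    "115.fds4": (213, 210, 210),
    "121.pop2": (55.2, 55.1, 55.2),
    "122.tachyon": (181, 187, 184),
    "126.lammps": (33.0, 33.1, 33.1),
    "127.wrf2": (259, 266, 258),
    "128.GAPgeofem": (229, 233, 233),
    "129.tera_tf": (139, 139, 139),
    "130.socorro": (166, 165, 164),
    "132.zeusmp2": (145, 145, 146),
    "137.lu": (175, 175, 175),
}


def median_ratios(runs):
    """Return the median of the three ratios for every benchmark."""
    return {name: median(ratios) for name, ratios in runs.items()}


def geometric_mean(values):
    """Geometric mean computed via logarithms."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


def overall_score(runs):
    """Geometric mean of the per-benchmark median ratios."""
    return geometric_mean(median_ratios(runs).values())


if __name__ == "__main__":
    for name, ratio in median_ratios(BASE_RATIOS).items():
        print(f"{name:<14} median ratio {ratio}")
    print(f"geometric mean of medians: {overall_score(BASE_RATIOS):.1f}")
```

The same functions can be applied to the peak columns by building a second dictionary in the same shape.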
Hardware Summary
Type of System: Homogeneous
Compute Node: SGI ICE XA IP-125 CS
Interconnect: InfiniBand (MPI and I/O)
File Server Node: SGI MIS Server
Total Compute Nodes: 40
Total Chips: 80
Total Cores: 1120
Total Threads: 2240
Total Memory: 5 TB
Base Ranks Run: 800
Minimum Peak Ranks: 320
Maximum Peak Ranks: 1120
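
The totals in this summary follow from the per-node figures given in the compute node description (40 nodes, 2 chips per node, 14 cores per chip, 2 threads per core, 128 GB per node). The short Python sketch below works through that arithmetic; the function name `system_totals` and the constant names are hypothetical, and the conversion assumes 1 TB = 1024 GB.

```python
# Per-node figures taken from the compute node description in this report.
NODES = 40
CHIPS_PER_NODE = 2
CORES_PER_CHIP = 14
THREADS_PER_CORE = 2
MEMORY_PER_NODE_GB = 128


def system_totals(nodes=NODES,
                  chips_per_node=CHIPS_PER_NODE,
                  cores_per_chip=CORES_PER_CHIP,
                  threads_per_core=THREADS_PER_CORE,
                  memory_per_node_gb=MEMORY_PER_NODE_GB):
    """Derive system-wide totals from per-node figures."""
    chips = nodes * chips_per_node
    cores = chips * cores_per_chip
    threads = cores * threads_per_core
    # Assumption: binary terabytes (1 TB = 1024 GB).
    memory_tb = nodes * memory_per_node_gb / 1024
    return {
        "chips": chips,
        "cores": cores,
        "threads": threads,
        "memory_tb": memory_tb,
    }


if __name__ == "__main__":
    for key, value in system_totals().items():
        print(f"{key}: {value}")
```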
Software Summary
C Compiler: Intel C++ Composer XE 2016 for Linux,
Version 16.0.3.210 Build 20160415
C++ Compiler: Intel C++ Composer XE 2016 for Linux,
Version 16.0.3.210 Build 20160405
Fortran Compiler: Intel Fortran Composer XE 2016 for Linux,
Version 16.0.3.210 Build 20160405
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: SGI MPT 2.14 Patch 11333
Other MPI Info: OFED 3.2.2
Pre-processors: None
Other Software: None

Node Description: SGI ICE XA IP-125 CS

Hardware
Number of nodes: 40
Uses of the node: compute
Vendor: SGI
Model: SGI ICE XA (Intel Xeon E5-2690 v4, 2.6 GHz)
CPU Name: Intel Xeon E5-2690 v4
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 28
Cores per chip: 14
Threads per core: 2
CPU Characteristics: 14 Core, 2.60 GHz, 9.6 GT/s QPI
Intel Turbo Boost Technology up to 3.50 GHz
Hyper-Threading Technology enabled
CPU MHz: 2600
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 35 MB I+D on chip per chip
Other Cache: None
Memory: 128 GB (8 x 16 GB 2Rx4 PC4-2400T-R)
Disk Subsystem: None
Other Hardware: None
Adapter: Mellanox MT27700 with ConnectX-4
ASIC (PCIe x16 Gen3 8 GT/s)
Number of Adapters: 2
Slot Type: PCIe x16 Gen3
Data Rate: InfiniBand 4X EDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27700 with ConnectX-4
ASIC (PCIe x16 Gen3 8 GT/s)
Adapter Driver: OFED-3.2.1.5.3
Adapter Firmware: 12.14.0114
Operating System: SUSE Linux Enterprise Server 11 SP4 (x86_64),
Kernel 3.0.101-71.1.10690.1.PTF-default
Local File System: NFSv3
Shared File System: NFSv3 IPoIB
System State: Multi-user, run level 3
Other Software: SGI Tempo Compute Node 3.3.0,
Build 714r18.sles11sp4-1604041900

Node Description: SGI MIS Server

Hardware
Number of nodes: 1
Uses of the node: fileserver
Vendor: SGI
Model: SGI MIS Server
CPU Name: Intel Xeon E5-2670
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 16
Cores per chip: 8
Threads per core: 1
CPU Characteristics: Intel Turbo Boost Technology up to 3.30 GHz
Hyper-Threading Technology disabled
CPU MHz: 1200
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 20 MB I+D on chip per chip
Other Cache: None
Memory: 128 GB (12 * 8 GB 2Rx4 PC3-12800R-11, ECC)
Disk Subsystem: 45 TB RAID 6
8 x 6+2 900GB (WD, 10K RPM)
Other Hardware: None
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
Number of Adapters: 2
Slot Type: PCIe x8 Gen3
Data Rate: InfiniBand 4X FDR
Ports Used: 2
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
Adapter Driver: OFED-3.2.0.1.1
Adapter Firmware: 2.36.5000
Operating System: SUSE Linux Enterprise Server 11 (x86_64),
Kernel 3.0.101-0.46-default
Local File System: xfs
Shared File System: --
System State: Multi-user, run level 3
Other Software: SGI Foundation Software 2.9,
Build 711r2.sles11sp3-1411192056

Interconnect Description: InfiniBand (MPI and I/O)

Hardware
Vendor: Mellanox Technologies and SGI
Model: None
Switch Model: SGI P0002145
Number of Switches: 10
Number of Ports: 36
Data Rate: InfiniBand 4X EDR
Firmware: 11.0350.0394
Topology: Enhanced Hypercube
Primary Use: MPI and I/O traffic

Submit Notes

The config file option 'submit' was used.

General Notes

Software environment:
  export MPI_REQUEST_MAX=65536
  export MPI_TYPE_MAX=32768
  export MPI_IB_RAILS=2
  export MPI_IB_UPGRADE_SENDS=50
  export MPI_IB_IMM_UPGRADE=false
  export MPI_IB_DCIS=2
  export MPI_CONNECTIONS_THRESHOLD=0
  export MPI_IB_MTU=4096
  ulimit -s unlimited
BIOS settings:
  AMI BIOS version HA012036
  Hyper-Threading Technology enabled
  Intel Turbo Boost Technology enabled (default)
  Transparent Hugepages enabled
Job Placement:
  Each MPI job was assigned to a topologically compact set
  of nodes.  The base run used 10 ranks per socket and peak
  runs varied between 4 and 14 ranks per socket.  The total
  number of sockets and nodes was constant.
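
To make the job placement figures concrete, the Python sketch below divides each rank count from the results table by the 80 sockets in the system (40 nodes with 2 chips each). The names `TOTAL_SOCKETS`, `PEAK_RANKS`, and `ranks_per_socket` are hypothetical. The values are averages: 512 and 1024 ranks do not divide evenly across 80 sockets, and this report does not state how those ranks were distributed.

```python
TOTAL_SOCKETS = 80  # 40 compute nodes x 2 chips per node
BASE_RANKS = 800

# Peak rank counts per benchmark, transcribed from the results table.
PEAK_RANKS = {
    "104.milc": 800,
    "107.leslie3d": 1120,
    "113.GemsFDTD": 320,
    "115.fds4": 800,
    "121.pop2": 512,
    "122.tachyon": 1120,
    "126.lammps": 320,
    "127.wrf2": 800,
    "128.GAPgeofem": 1024,
    "129.tera_tf": 1024,
    "130.socorro": 640,
    "132.zeusmp2": 512,
    "137.lu": 512,
}


def ranks_per_socket(ranks, sockets=TOTAL_SOCKETS):
    """Average number of MPI ranks placed on each socket."""
    return ranks / sockets


if __name__ == "__main__":
    print(f"base: {ranks_per_socket(BASE_RANKS):.1f} ranks per socket")
    for name, ranks in PEAK_RANKS.items():
        print(f"{name:<14} {ranks:>5} ranks, "
              f"{ranks_per_socket(ranks):.1f} per socket on average")
```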
Additional notes regarding interconnect:
  The InfiniBand network consists of two independent planes,
  with half the switches in the system allocated to each plane.
  I/O traffic is restricted to one plane, while MPI traffic can
  use both planes.

Compiler Invocation

C benchmarks:

 icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 
130.socorro:  -assume nostd_intent_in 

Base Optimization Flags

C benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX2   -no-prec-div   -ansi-alias 

Fortran benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xCORE-AVX2   -no-prec-div 

Peak Optimization Flags

C benchmarks:

104.milc:  basepeak = yes 
122.tachyon:  -O3   -xCORE-AVX2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX2   -no-prec-div   -ansi-alias 

Fortran benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

Benchmarks using both Fortran and C:

115.fds4:  basepeak = yes 
121.pop2:  -O3   -xCORE-AVX2   -no-prec-div 
127.wrf2:  basepeak = yes 
128.GAPgeofem:  Same as 121.pop2 
130.socorro:  Same as 121.pop2 
132.zeusmp2:  Same as 121.pop2 

Other Flags

C benchmarks:

 -lmpi 

C++ benchmarks:

126.lammps:  -lmpi 

Fortran benchmarks:

 -lmpi 

Benchmarks using both Fortran and C:

 -lmpi 

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel14_flags.20140908.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel14_flags.20140908.xml.