SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

SGI

SGI ICE X
(Intel Xeon E5-2690 v3, 2.6 GHz)

MPI2007 license: 14
Test date: Jul-2014
Test sponsor: SGI
Hardware Availability: Sep-2014
Tested by: SGI
Software Availability: Apr-2014

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
               Base                                                   Peak
Benchmark      Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
104.milc         800     10.7    146     10.7    146     10.7    146    800     10.7    146     10.7    146     10.7    146
107.leslie3d     800     28.8    181     28.8    182     27.9    187    800     28.8    181     28.8    182     27.9    187
113.GemsFDTD     800      219   28.8      219   28.8      219   28.8    160      146   43.2      146   43.2      146   43.1
115.fds4         800     10.7    182     10.9    179     10.9    179    800     10.7    182     10.9    179     10.9    179
121.pop2         800     74.3   55.5     74.3   55.5     74.5   55.4    800     74.3   55.5     74.3   55.5     74.5   55.4
122.tachyon      800     22.8    123     22.8    123     22.8    123    800     22.8    123     22.8    123     22.8    123
126.lammps       800     99.1   29.4      101   28.8     99.0   29.4    800     99.1   29.4      101   28.8     99.0   29.4
127.wrf2         800     29.6    263     29.6    263     29.4    265    800     29.6    263     29.6    263     29.4    265
128.GAPgeofem    800     9.04    228     9.18    225     9.21    224    800     9.04    228     9.18    225     9.21    224
129.tera_tf      800     23.2    119     23.2    119     23.3    119    800     23.2    119     23.2    119     23.3    119
130.socorro      800     35.3    108     35.3    108     35.2    108    800     35.3    108     35.3    108     35.2    108
132.zeusmp2      800     24.2    128     24.1    129     24.2    128    800     24.2    128     24.1    129     24.2    128
137.lu           800     23.7    155     23.7    155     23.6    156    800     23.7    155     23.7    155     23.6    156
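
The bold and underline marking of median runs is lost in plain text, but the medians can be recomputed from the three ratios listed for each benchmark. SPEC MPI2007 summarizes a run as the geometric mean of the per-benchmark median ratios. The sketch below applies that to the base ratios in the table. The function name is illustrative, and because the table values are rounded, the printed figure may differ slightly from the officially published metric.

```python
import math
import statistics

# Base ratios for the three runs of each benchmark, copied from the Results Table.
base_ratios = {
    "104.milc": [146, 146, 146],
    "107.leslie3d": [181, 182, 187],
    "113.GemsFDTD": [28.8, 28.8, 28.8],
    "115.fds4": [182, 179, 179],
    "121.pop2": [55.5, 55.5, 55.4],
    "122.tachyon": [123, 123, 123],
    "126.lammps": [29.4, 28.8, 29.4],
    "127.wrf2": [263, 263, 265],
    "128.GAPgeofem": [228, 225, 224],
    "129.tera_tf": [119, 119, 119],
    "130.socorro": [108, 108, 108],
    "132.zeusmp2": [128, 129, 128],
    "137.lu": [155, 155, 156],
}


def geometric_mean_of_medians(ratios_by_benchmark):
    """Take the median ratio per benchmark, then the geometric mean across benchmarks."""
    medians = [statistics.median(runs) for runs in ratios_by_benchmark.values()]
    log_sum = sum(math.log(m) for m in medians)
    return math.exp(log_sum / len(medians))


if __name__ == "__main__":
    # Approximate overall base figure recomputed from the rounded table values.
    print(f"{geometric_mean_of_medians(base_ratios):.1f}")
```

The same function can be applied to the peak columns by substituting the peak ratios for 113.GemsFDTD.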
Hardware Summary
Type of System: Homogeneous
Compute Node: SGI ICE X IP-131 Compute Node
Interconnect: InfiniBand (MPI and I/O)
File Server Node: SGI Rackable C1103-TY12
Total Compute Nodes: 40
Total Chips: 80
Total Cores: 960
Total Threads: 960
Total Memory: 5 TB
Base Ranks Run: 800
Minimum Peak Ranks: 160
Maximum Peak Ranks: 800
Software Summary
C Compiler: Intel C++ Composer XE 2013 for Linux,
Version 14.0.3.174 Build 20140422
C++ Compiler: Intel C++ Composer XE 2013 for Linux,
Version 14.0.3.174 Build 20140422
Fortran Compiler: Intel Fortran Composer XE 2013 for Linux,
Version 14.0.3.174 Build 20140422
Base Pointers: 64-bit
Peak Pointers: Not Applicable
MPI Library: SGI MPT 2.09 Patch 11049
Other MPI Info: OFED 1.5.4
Pre-processors: None
Other Software: None

Node Description: SGI ICE X IP-131 Compute Node

Hardware
Number of nodes: 40
Uses of the node: compute
Vendor: SGI
Model: SGI ICE X (Intel Xeon E5-2690 v3, 2.6 GHz)
CPU Name: Intel Xeon E5-2690 v3
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 24
Cores per chip: 12
Threads per core: 1
CPU Characteristics: 12 Core, 2.60 GHz, 9.6 GT/s QPI
Intel Turbo Boost Technology up to 3.50 GHz
Hyper-Threading Technology disabled
CPU MHz: 2600
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 30 MB I+D on chip per chip
Other Cache: None
Memory: 128 GB (8 x 16 GB 2Rx4 PC4-17000R-15, ECC)
Disk Subsystem: None
Other Hardware: None
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen3
Data Rate: InfiniBand 4x FDR
Ports Used: 2
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Adapter Driver: OFED-1.5.4
Adapter Firmware: 2.30.3000
Operating System: SUSE Linux Enterprise Server 11 SP3 (x86_64),
Kernel 3.0.93-0.8-default
Local File System: NFSv3
Shared File System: NFSv3 IPoIB
System State: Multi-user, run level 3
Other Software: SGI Tempo Service Node 2.8.1,
Build 709rp49.sles11sp3-1402182002

Node Description: SGI Rackable C1103-TY12

Hardware
Number of nodes: 1
Uses of the node: fileserver
Vendor: SGI
Model: SGI Rackable C1103-TY12 (Intel Xeon X5670, 2.93 GHz)
CPU Name: Intel Xeon X5670
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 12
Cores per chip: 6
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.33 GHz
Hyper-Threading Technology enabled
CPU MHz: 2933
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 96 GB (12 x 8 GB 2Rx4 PC3-10600R-9, ECC)
Disk Subsystem: 12 TB RAID 6
12 x 1 TB SATA (Seagate Constellation, 7200RPM)
Other Hardware: None
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen3
Data Rate: InfiniBand 4x FDR
Ports Used: 2
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Adapter Driver: OFED-1.5.2
Adapter Firmware: 2.30.3000
Operating System: SUSE Linux Enterprise Server 11 SP1 (x86_64),
Kernel 2.6.32.46-0.3-default
Local File System: xfs
Shared File System: --
System State: Multi-user, run level 3
Other Software: SGI Foundation Software 2.5,
Build 705r10.sles11-1110192111

Interconnect Description: InfiniBand (MPI and I/O)

Hardware
Vendor: Mellanox Technologies and SGI
Model: None
Switch Model: SGI FDR Integrated IB Switch Blade 2SW9x27 with
Mellanox SwitchX device 51000
Number of Switches: 10
Number of Ports: 36
Data Rate: InfiniBand 4x FDR
Firmware: 09.02.3000
Topology: Enhanced Hypercube
Primary Use: MPI and I/O traffic

Submit Notes

The config file option 'submit' was used.

General Notes

 Software environment:
   export MPI_REQUEST_MAX=65536
   export MPI_TYPE_MAX=32768
   export MPI_IB_RAILS=2
   export MPI_CONNECTIONS_THRESHOLD=0
   ulimit -s unlimited

 BIOS settings:
   AMI BIOS version DY2E6044
   Hyper-Threading Technology disabled
   Intel Turbo Boost Technology enabled (default)
   Intel Turbo Boost Technology activated with
     modprobe acpi_cpufreq
     cpupower frequency-set -u 2601MHz -d 2601MHz -g performance

 Job Placement:
   Ten ranks were assigned to each CPU chip, leaving two
   cores per chip idle. Ten switches were used
   in a topologically compact configuration.
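
 The placement figures can be cross-checked against the Hardware Summary with a short calculation. This is an illustrative sketch only: the function and variable names are hypothetical, and the inputs are the node, chip, and core counts reported in this result.

```python
# Values taken from the Hardware Summary and the Job Placement note.
NODES = 40
CHIPS_PER_NODE = 2
CORES_PER_CHIP = 12
RANKS_PER_CHIP = 10


def placement_summary(nodes, chips_per_node, cores_per_chip, ranks_per_chip):
    """Return total chips, cores, ranks, and idle cores for a uniform placement."""
    chips = nodes * chips_per_node
    cores = chips * cores_per_chip
    ranks = chips * ranks_per_chip
    return {"chips": chips, "cores": cores, "ranks": ranks, "idle_cores": cores - ranks}


summary = placement_summary(NODES, CHIPS_PER_NODE, CORES_PER_CHIP, RANKS_PER_CHIP)
print(summary)  # {'chips': 80, 'cores': 960, 'ranks': 800, 'idle_cores': 160}
```

 The totals match the reported 80 chips, 960 cores, and 800 base ranks.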

 Additional notes regarding interconnect:
   The InfiniBand network consists of two independent planes,
   with half the switches in the system allocated to each plane.
   I/O traffic is restricted to one plane, while MPI traffic can
   use both planes.

 Peak run:
   In the peak run, some benchmarks used a different number of ranks
   than in base. This is the only difference between base and peak.
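
 In the Results Table, 113.GemsFDTD is the only benchmark whose peak rank count (160) differs from base (800). The sketch below compares its median run times as listed there. The helper name is hypothetical and the calculation is only a reading aid, not part of the SPEC tooling.

```python
import statistics

# 113.GemsFDTD run times in seconds, copied from the Results Table.
base_seconds = [219, 219, 219]  # base run, 800 ranks
peak_seconds = [146, 146, 146]  # peak run, 160 ranks


def median_speedup(base_runs, peak_runs):
    """Ratio of median base time to median peak time (values above 1 mean peak is faster)."""
    return statistics.median(base_runs) / statistics.median(peak_runs)


print(f"{median_speedup(base_seconds, peak_seconds):.2f}")  # 1.50
```

 The same factor appears in the reported ratios, 43.2 for peak against 28.8 for base.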

Compiler Invocation

C benchmarks:

 icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 
130.socorro:  -assume nostd_intent_in 

Base Optimization Flags

C benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX2   -no-prec-div   -ansi-alias 

Fortran benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xCORE-AVX2   -no-prec-div 

Peak Optimization Flags

C benchmarks:

104.milc:  basepeak = yes 
122.tachyon:  basepeak = yes 

C++ benchmarks:

126.lammps:  basepeak = yes 

Fortran benchmarks:

107.leslie3d:  basepeak = yes 
113.GemsFDTD:  -O3   -xCORE-AVX2   -no-prec-div 
129.tera_tf:  basepeak = yes 
137.lu:  basepeak = yes 

Benchmarks using both Fortran and C:

115.fds4:  basepeak = yes 
121.pop2:  basepeak = yes 
127.wrf2:  basepeak = yes 
128.GAPgeofem:  basepeak = yes 
130.socorro:  basepeak = yes 
132.zeusmp2:  basepeak = yes 

Other Flags

C benchmarks:

 -lmpi 

C++ benchmarks:

126.lammps:  -lmpi 

Fortran benchmarks:

 -lmpi 

Benchmarks using both Fortran and C:

 -lmpi 

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel14_flags.20140908.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel14_flags.20140908.xml.