SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

SGI

SGI Altix ICE 8400EX
(Intel Xeon X5690, 3.46 GHz)

MPI2007 license: 4 Test date: Jun-2011
Test sponsor: SGI Hardware Availability: Feb-2011
Tested by: SGI Software Availability: Aug-2011

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark      Base                                                   Peak
               Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
104.milc         768     18.7   83.6     15.8   98.9     15.9   98.6    768     18.7   83.6     15.8   98.9     15.9   98.6
107.leslie3d     768     53.8   96.9     53.9   96.8     53.6   97.5    768     53.8   96.9     53.9   96.8     53.6   97.5
113.GemsFDTD     768      360   17.5      358   17.6      359   17.6    192      238   26.5      238   26.5      238   26.5
115.fds4         768     14.9    131     15.7    124     15.1    129    696     14.6    133     20.7   94.0     14.8    132
121.pop2         768      114   36.3      114   36.3      114   36.2    768      114   36.3      114   36.3      114   36.2
122.tachyon      768     28.2   99.1     28.3   99.0     28.3   98.9    768     28.2   99.1     28.3   99.0     28.3   98.9
126.lammps       768      130   22.4      130   22.4      129   22.5    192      127   22.9      127   22.9      127   22.9
127.wrf2         768     53.9    145     53.8    145     54.0    144    768     53.9    145     53.8    145     54.0    144
128.GAPgeofem    768     15.2    136     15.0    137     15.0    138    768     15.2    136     15.0    137     15.0    138
129.tera_tf      768     32.2   86.0     32.2   86.0     32.2   85.9    768     32.2   86.0     32.2   86.0     32.2   85.9
130.socorro      768     63.8   59.8     64.1   59.6     64.6   59.1    768     63.8   59.8     64.1   59.6     64.6   59.1
132.zeusmp2      768     33.3   93.1     33.3   93.1     33.4   92.8    528     32.4   95.6     32.4   95.7     32.3   96.1
137.lu           768     35.9    102     35.9    102     35.9    102    504     34.1    108     34.1    108     34.1    108
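
As an illustration of how the per-benchmark ratios combine into a single figure, the sketch below takes the median ratio of the three runs for each benchmark and forms the geometric mean, which is how SPEC suites summarize results. The ratios are copied from the table above; because they are already rounded, the computed values are only approximations and are not the officially reported metrics. The names BASE, PEAK, and overall are chosen here for illustration.

```python
from math import exp, log
from statistics import median

# Ratios of the three runs per benchmark, copied from the Results Table.
BASE = {
    "104.milc": (83.6, 98.9, 98.6),
    "107.leslie3d": (96.9, 96.8, 97.5),
    "113.GemsFDTD": (17.5, 17.6, 17.6),
    "115.fds4": (131, 124, 129),
    "121.pop2": (36.3, 36.3, 36.2),
    "122.tachyon": (99.1, 99.0, 98.9),
    "126.lammps": (22.4, 22.4, 22.5),
    "127.wrf2": (145, 145, 144),
    "128.GAPgeofem": (136, 137, 138),
    "129.tera_tf": (86.0, 86.0, 85.9),
    "130.socorro": (59.8, 59.6, 59.1),
    "132.zeusmp2": (93.1, 93.1, 92.8),
    "137.lu": (102, 102, 102),
}

# Peak differs from base only for the benchmarks run with fewer ranks.
PEAK = dict(BASE)
PEAK.update({
    "113.GemsFDTD": (26.5, 26.5, 26.5),
    "115.fds4": (133, 94.0, 132),
    "126.lammps": (22.9, 22.9, 22.9),
    "132.zeusmp2": (95.6, 95.7, 96.1),
    "137.lu": (108, 108, 108),
})


def overall(ratios_by_benchmark):
    """Geometric mean of the per-benchmark median ratios."""
    medians = [median(runs) for runs in ratios_by_benchmark.values()]
    return exp(sum(log(m) for m in medians) / len(medians))


if __name__ == "__main__":
    print(f"approximate base metric: {overall(BASE):.1f}")
    print(f"approximate peak metric: {overall(PEAK):.1f}")
```

Using the median rather than the best run keeps a single outlier, such as the 20.7-second peak run of 115.fds4, from moving the summary figure.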
Hardware Summary
Type of System: Homogeneous
Compute Node: SGI Altix ICE 8400EX Compute Node
Interconnect: InfiniBand (MPI and I/O)
File Server Node: SGI InfiniteStorage Nexis 2000 NAS
Total Compute Nodes: 64
Total Chips: 128
Total Cores: 768
Total Threads: 1536
Total Memory: 1536 GB
Base Ranks Run: 768
Minimum Peak Ranks: 192
Maximum Peak Ranks: 768
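
The totals in this summary can be cross-checked against the per-node figures given in the compute node description (2 chips, 6 cores per chip, 2 threads per core, 24 GB per node). The following is a minimal sketch of that arithmetic; the constant and function names are illustrative only.

```python
# Per-node values from the compute node description; node count from the summary.
NODES = 64
CHIPS_PER_NODE = 2
CORES_PER_CHIP = 6
THREADS_PER_CORE = 2
MEMORY_GB_PER_NODE = 24


def system_totals():
    """Derive the system-wide totals from the per-node configuration."""
    chips = NODES * CHIPS_PER_NODE
    cores = chips * CORES_PER_CHIP
    return {
        "chips": chips,
        "cores": cores,
        "threads": cores * THREADS_PER_CORE,
        "memory_gb": NODES * MEMORY_GB_PER_NODE,
    }


if __name__ == "__main__":
    for name, value in system_totals().items():
        print(f"{name}: {value}")
```

With 768 base ranks on 768 cores, the base run places one MPI rank on each physical core.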
Software Summary
C Compiler: Intel C++ Composer XE 2011 for Linux,
Version 12.0.3.174 Build 20110309
C++ Compiler: Intel C++ Composer XE 2011 for Linux,
Version 12.0.3.174 Build 20110309
Fortran Compiler: Intel Fortran Composer XE 2011 for Linux,
Version 12.0.3.174 Build 20110309
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: SGI MPT 2.04 Patch 10789
Other MPI Info: OFED 1.4.2
Pre-processors: None
Other Software: None

Node Description: SGI Altix ICE 8400EX Compute Node

Hardware
Number of nodes: 64
Uses of the node: compute
Vendor: SGI
Model: SGI Altix ICE 8400EX (Intel Xeon X5690, 3.46 GHz)
CPU Name: Intel Xeon X5690
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 12
Cores per chip: 6
Threads per core: 2
CPU Characteristics: Six Core, 3.46 GHz, 6.4 GT/s QPI
Intel Turbo Boost Technology up to 3.73 GHz
Hyper-Threading Technology enabled
CPU MHz: 3467
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 24 GB (6 x 4 GB 2Rx4 PC3-10600R-9, ECC)
Disk Subsystem: None
Other Hardware: None
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen2
Data Rate: InfiniBand 4x QDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Adapter Driver: OFED-1.4.2
Adapter Firmware: 2.7.8200
Operating System: SUSE Linux Enterprise Server 11 SP1,
Kernel 2.6.32.13-0.4-default
Local File System: NFSv3
Shared File System: NFSv3 IPoIB
System State: Multi-user, run level 3
Other Software: SGI ProPack 7SP1 for Linux,
Build 701r3.sles11-1005252113
SGI Tempo Compute Node 2.1,
Build 701r3.sles11-1005252113

Node Description: SGI InfiniteStorage Nexis 2000 NAS

Hardware
Number of nodes: 1
Uses of the node: fileserver
Vendor: SGI
Model: SGI Altix XE 270 (Intel Xeon X5670, 2.93 GHz)
CPU Name: Intel Xeon X5670
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 12
Cores per chip: 6
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.33 GHz
Hyper-Threading Technology enabled
CPU MHz: 2933
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 96 GB (12 x 8 GB DDR3-1333 CL9 DIMMs)
Disk Subsystem: 8.8 TB RAID 5
60 x 146 GB SAS (Seagate Cheetah 15K.5)
Other Hardware: None
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen2
Data Rate: InfiniBand 4x QDR
Ports Used: 2
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Adapter Driver: OFED-1.4.0
Adapter Firmware: 2.7.0
Operating System: SUSE Linux Enterprise Server 11 (x86_64)
Kernel 2.6.27.19-5-default
Local File System: xfs
Shared File System: --
System State: Multi-user, run level 3
Other Software: SGI Foundation Software 2, Build
700r3.sles11-1004061553

Interconnect Description: InfiniBand (MPI and I/O)

Hardware
Vendor: Mellanox Technologies and SGI
Model: None
Switch Model: SGI QDR_1.5_HYPR_2454 with Mellanox Device 48438
(Infiniscale IV)
Number of Switches: 16
Number of Ports: 36
Data Rate: InfiniBand 4x QDR
Firmware: 5040005
Topology: Enhanced Hypercube
Primary Use: MPI and I/O traffic

Submit Notes

The config file option 'submit' was used.

General Notes

 Software environment:
   export MPI_REQUEST_MAX=65536
   export MPI_TYPE_MAX=32768
   export MPI_BUFS_THRESHOLD=1
   export MPI_IB_RAILS=2
   ulimit -s unlimited

 BIOS settings:
   AMI BIOS version 080016
   Hyper-Threading Technology enabled (default)
   Intel Turbo Boost Technology enabled (default)
   Intel Turbo Boost Technology activated in the OS via
     /etc/init.d/acpid start
     /etc/init.d/powersaved start
     powersave -f

 Job Placement:
   Each MPI job was assigned to a topologically compact set
   of nodes, i.e. each job used the minimum number of switches
   needed: 2 switches for up to 96 ranks,
   4 switches for 192 ranks, 8 switches for 384 ranks,
   16 switches for 768 ranks.
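
The tiers listed in the job placement note are all consistent with two switches for every 96 ranks. The sketch below encodes that rule. The value of 96 ranks per switch pair is inferred from the listed tiers and is not stated in the report, so results for rank counts the note does not list (for example 504 or 528) are an extrapolation, not a documented placement.

```python
from math import ceil

# Inferred from the listed tiers (96 -> 2, 192 -> 4, 384 -> 8, 768 -> 16);
# the report does not state this figure directly.
RANKS_PER_SWITCH_PAIR = 96


def switches_needed(ranks):
    """Minimum switch count for a job, assuming two switches per 96 ranks."""
    if ranks < 1:
        raise ValueError("a job needs at least one rank")
    return 2 * ceil(ranks / RANKS_PER_SWITCH_PAIR)


if __name__ == "__main__":
    for ranks in (96, 192, 384, 768):
        print(f"{ranks} ranks -> {switches_needed(ranks)} switches")
```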

 Additional notes regarding interconnect:
   The InfiniBand network consists of two independent planes,
   with half the switches in the system allocated to each plane.
   I/O traffic is restricted to one plane, while MPI traffic can
   use both planes.

 Peak run:
   In the peak run, some benchmarks used a different number of ranks
   than in base. This is the only difference between base and peak.
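
To show what the change in rank count bought for each affected benchmark, the sketch below divides the median base time by the median peak time. The rank counts and times are copied from the Results Table; the names RUNS and speedup are illustrative.

```python
from statistics import median

# benchmark: (base ranks, base seconds, peak ranks, peak seconds)
RUNS = {
    "113.GemsFDTD": (768, (360, 358, 359), 192, (238, 238, 238)),
    "115.fds4": (768, (14.9, 15.7, 15.1), 696, (14.6, 20.7, 14.8)),
    "126.lammps": (768, (130, 130, 129), 192, (127, 127, 127)),
    "132.zeusmp2": (768, (33.3, 33.3, 33.4), 528, (32.4, 32.4, 32.3)),
    "137.lu": (768, (35.9, 35.9, 35.9), 504, (34.1, 34.1, 34.1)),
}


def speedup(benchmark):
    """Median base time divided by median peak time (above 1.0 means peak is faster)."""
    _, base_seconds, _, peak_seconds = RUNS[benchmark]
    return median(base_seconds) / median(peak_seconds)


if __name__ == "__main__":
    for name, (base_ranks, _, peak_ranks, _) in RUNS.items():
        print(f"{name}: {base_ranks} -> {peak_ranks} ranks, {speedup(name):.2f}x")
```

All five benchmarks ran faster with fewer ranks, with 113.GemsFDTD showing by far the largest gain.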

Compiler Invocation

C benchmarks:

 icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 

Base Optimization Flags

C benchmarks:

 -O3   -xSSE4.2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xSSE4.2   -no-prec-div   -ansi-alias 

Fortran benchmarks:

 -O3   -xSSE4.2   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xSSE4.2   -no-prec-div 

Peak Optimization Flags

C benchmarks:

104.milc:  basepeak = yes 
122.tachyon:  basepeak = yes 

C++ benchmarks:

126.lammps:  -O3   -xSSE4.2   -no-prec-div   -ansi-alias 

Fortran benchmarks:

107.leslie3d:  basepeak = yes 
113.GemsFDTD:  -O3   -xSSE4.2   -no-prec-div 
129.tera_tf:  basepeak = yes 
137.lu:  Same as 113.GemsFDTD 

Benchmarks using both Fortran and C:

115.fds4:  -O3   -xSSE4.2   -no-prec-div 
121.pop2:  basepeak = yes 
127.wrf2:  basepeak = yes 
128.GAPgeofem:  basepeak = yes 
130.socorro:  basepeak = yes 
132.zeusmp2:  Same as 115.fds4 

Other Flags

C benchmarks:

 -lmpi 

C++ benchmarks:

126.lammps:  -lmpi 

Fortran benchmarks:

 -lmpi 

Benchmarks using both Fortran and C:

 -lmpi 
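
For orientation, the sketch below assembles the compiler, portability, base optimization, and link flags listed in the sections above into one command prefix per benchmark. The flag order and the helper name base_command_lines are assumptions made for illustration; the SPEC tools construct the actual command lines and may order them differently.

```python
# Values copied from the Compiler Invocation, Portability Flags,
# Base Optimization Flags, and Other Flags sections of this report.
COMPILERS = {
    "C": ["icc"],
    "C++": ["icpc"],
    "Fortran": ["ifort"],
    "Fortran+C": ["icc", "ifort"],
}
PORTABILITY = {
    "121.pop2": ["-DSPEC_MPI_CASE_FLAG"],
    "127.wrf2": ["-DSPEC_MPI_CASE_FLAG", "-DSPEC_MPI_LINUX"],
}
BASE_OPT = ["-O3", "-xSSE4.2", "-no-prec-div"]
EXTRA_OPT = {"126.lammps": ["-ansi-alias"]}
LINK = ["-lmpi"]


def base_command_lines(benchmark, language):
    """Return one flag string per compiler used by the benchmark (order is illustrative)."""
    flags = (
        PORTABILITY.get(benchmark, [])
        + BASE_OPT
        + EXTRA_OPT.get(benchmark, [])
        + LINK
    )
    return [" ".join([compiler] + flags) for compiler in COMPILERS[language]]


if __name__ == "__main__":
    for line in base_command_lines("127.wrf2", "Fortran+C"):
        print(line)
```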

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/results/flags/SGI_x86_64_Intel12_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/results/flags/SGI_x86_64_Intel12_flags.xml.