MPI2007 license: | 4 | Test date: | Jun-2011 |
---|---|---|---|
Test sponsor: | SGI | Hardware Availability: | Mar-2011 |
Tested by: | SGI | Software Availability: | Aug-2011 |

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark | Base Ranks | Base Seconds (run 1) | Base Ratio (run 1) | Base Seconds (run 2) | Base Ratio (run 2) | Base Seconds (run 3) | Base Ratio (run 3) | Peak Ranks | Peak Seconds (run 1) | Peak Ratio (run 1) | Peak Seconds (run 2) | Peak Ratio (run 2) | Peak Seconds (run 3) | Peak Ratio (run 3) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
121.pop2 | 3072 | 124 | 31.4 | 112 | 34.7 | 112 | 34.7 | 2048 | 93.9 | 41.4 | 94.1 | 41.4 | 96.4 | 40.4 |
122.tachyon | 3072 | 218 | 8.92 | 96.9 | 20.1 | 96.4 | 20.2 | 3072 | 218 | 8.92 | 96.9 | 20.1 | 96.4 | 20.2 |
125.RAxML | 3072 | 148 | 19.7 | 148 | 19.7 | 148 | 19.7 | 3072 | 148 | 19.7 | 148 | 19.7 | 148 | 19.7 |
126.lammps | 3072 | 52.0 | 47.3 | 52.1 | 47.2 | 52.0 | 47.2 | 3072 | 52.0 | 47.3 | 52.1 | 47.2 | 52.0 | 47.2 |
128.GAPgeofem | 3072 | 144 | 41.1 | 144 | 41.1 | 145 | 41.0 | 2048 | 140 | 42.3 | 140 | 42.5 | 140 | 42.4 |
129.tera_tf | 3072 | 74.5 | 14.7 | 74.5 | 14.7 | 74.2 | 14.8 | 3072 | 74.5 | 14.7 | 74.5 | 14.7 | 74.2 | 14.8 |
132.zeusmp2 | 3072 | 61.2 | 34.7 | 61.4 | 34.5 | 61.5 | 34.4 | 2048 | 54.9 | 38.6 | 54.9 | 38.6 | 55.0 | 38.5 |
137.lu | 3072 | 57.1 | 73.6 | 56.9 | 73.9 | 56.7 | 74.2 | 2048 | 51.9 | 81.0 | 51.8 | 81.1 | 51.8 | 81.2 |
142.dmilc | 3072 | 38.2 | 96.6 | 38.7 | 95.2 | 38.4 | 95.8 | 3072 | 38.2 | 96.6 | 38.7 | 95.2 | 38.4 | 95.8 |
143.dleslie | 3072 | 1069 | 2.90 | 1072 | 2.89 | 1072 | 2.89 | 2048 | 57.3 | 54.1 | 70.1 | 44.2 | 57.0 | 54.4 |
145.lGemsFDTD | 3072 | 189 | 23.4 | 195 | 22.7 | 185 | 23.8 | 2048 | 145 | 30.5 | 144 | 30.6 | 144 | 30.6 |
147.l2wrf2 | 3072 | 158 | 51.8 | 158 | 52.1 | 158 | 51.9 | 3072 | 158 | 51.8 | 158 | 52.1 | 158 | 51.9 |
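
The bold/underline marking of the median run is not visible in this plain-text table, so the median can be recovered from the three listed ratios. The following is a minimal Python sketch for two of the rows above; the dictionary layout and the name `median_ratio` are illustrative assumptions, and only the numbers come from the table.

```python
from statistics import median

# (seconds, ratio) for the three runs, copied from the results table above.
runs = {
    "121.pop2": {
        "base": [(124, 31.4), (112, 34.7), (112, 34.7)],
        "peak": [(93.9, 41.4), (94.1, 41.4), (96.4, 40.4)],
    },
    "143.dleslie": {
        "base": [(1069, 2.90), (1072, 2.89), (1072, 2.89)],
        "peak": [(57.3, 54.1), (70.1, 44.2), (57.0, 54.4)],
    },
}


def median_ratio(measurements):
    """Return the middle ratio of the three (seconds, ratio) measurements."""
    return median(ratio for _, ratio in measurements)


# Median ratio per benchmark and tuning level.
summary = {
    name: {tuning: median_ratio(m) for tuning, m in tunings.items()}
    for name, tunings in runs.items()
}

for name, medians in summary.items():
    print(f"{name}: base median {medians['base']}, peak median {medians['peak']}")
```

With three runs the median is simply the middle value, so no interpolation is involved and the result is always one of the listed ratios.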
Hardware Summary | |
---|---|
Type of System: | Homogeneous |
Compute Node: | SGI Altix ICE 8400EX Compute Node |
Interconnect: | InfiniBand (MPI and I/O) |
File Server Node: | SGI InfiniteStorage 4000 |
Total Compute Nodes: | 128 |
Total Chips: | 256 |
Total Cores: | 3072 |
Total Threads: | 3072 |
Total Memory: | 8 TB |
Base Ranks Run: | 3072 |
Minimum Peak Ranks: | 2048 |
Maximum Peak Ranks: | 3072 |
Software Summary | |
---|---|
C Compiler: | Intel C++ Composer XE 2011 for Linux, Version 12.0.3.174 Build 20110309 |
C++ Compiler: | Intel C++ Composer XE 2011 for Linux, Version 12.0.3.174 Build 20110309 |
Fortran Compiler: | Intel Fortran Composer XE 2011 for Linux, Version 12.0.3.174 Build 20110309 |
Base Pointers: | 64-bit |
Peak Pointers: | 64-bit |
MPI Library: | SGI MPT 2.04 Patch 10789 |
Other MPI Info: | OFED 1.4.2 |
Pre-processors: | None |
Other Software: | None |
Hardware: Compute Node | |
---|---|
Number of nodes: | 128 |
Uses of the node: | compute |
Vendor: | SGI |
Model: | SGI Altix ICE 8400EX (AMD Opteron 6180 SE, 2.5GHz) |
CPU Name: | AMD Opteron 6180 SE |
CPU(s) orderable: | 1-2 chips |
Chips enabled: | 2 |
Cores enabled: | 24 |
Cores per chip: | 12 |
Threads per core: | 1 |
CPU Characteristics: | 12 Cores/chip, 2.5 GHz |
CPU MHz: | 2500 |
Primary Cache: | 64 KB I + 64 KB D on chip per core |
Secondary Cache: | 512 KB I+D on chip per core |
L3 Cache: | 12 MB I+D on chip per chip, 6 MB shared / 6 cores |
Other Cache: | None |
Memory: | 64 GB (16 x 4 GB, 2Rx4 PC3-10600R-9, ECC) |
Disk Subsystem: | None |
Other Hardware: | None |
Adapter: | Mellanox MT26428 ConnectX IB QDR (PCIe x8 Gen2 5 GT/s) |
Number of Adapters: | 1 |
Slot Type: | PCIe x8 Gen2 |
Data Rate: | InfiniBand 4x QDR |
Ports Used: | 2 |
Interconnect Type: | InfiniBand |
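
The per-node figures in this table can be checked against the totals in the Hardware Summary. The sketch below is only an arithmetic cross-check; the variable names are illustrative, and the conversion of 1 TB = 1024 GB is an assumption.

```python
# Per-node values from the compute node hardware table.
nodes = 128
chips_per_node = 2
cores_per_chip = 12
threads_per_core = 1
memory_per_node_gb = 64

# Totals derived from the per-node values.
total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core
total_memory_tb = nodes * memory_per_node_gb / 1024  # assumes 1 TB = 1024 GB

print(total_chips, total_cores, total_threads, total_memory_tb)
```

The results match the Hardware Summary entries for Total Chips, Total Cores, Total Threads and Total Memory.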
Software: Compute Node | |
---|---|
Adapter: | Mellanox MT26428 ConnectX IB QDR (PCIe x8 Gen2 5 GT/s) |
Adapter Driver: | OFED-1.4.2 |
Adapter Firmware: | 2.7.0 |
Operating System: | SUSE Linux Enterprise Server 11 SP1 (x86_64) Kernel 2.6.32.27-0.2-default |
Local File System: | NFSv3 |
Shared File System: | NFSv3 IPoIB |
System State: | Run Level 3 (Multi-User) |
Other Software: | SGI Performance Suite 1.0, Build 702r19.sles11-1010072114; SGI Tempo Compute Node 2.2, Build 702r19.sles11-1010072114 |
Hardware: File Server Node | |
---|---|
Number of nodes: | 1 |
Uses of the node: | fileserver |
Vendor: | SGI |
Model: | SGI Altix 450 (Intel Itanium 2, 1.6GHz) |
CPU Name: | Intel Itanium 2 9030 |
CPU(s) orderable: | 2-38 chips |
Chips enabled: | 2 |
Cores enabled: | 4 |
Cores per chip: | 2 |
Threads per core: | 1 |
CPU Characteristics: | 1.6GHz/8MB, 533MHz FSB |
CPU MHz: | 1600 |
Primary Cache: | 16 KB I + 16 KB D on chip per core |
Secondary Cache: | 1 MB I + 256 KB D on chip per core |
L3 Cache: | 4 MB I+D on chip per core |
Other Cache: | None |
Memory: | 24 GB (12 x 2 GB, 2Rx4 PC2-3200-3, ECC) |
Disk Subsystem: | 16 TB RAID 5 32 x 500 GB SATA (Seagate Barracuda 7.2K) |
Other Hardware: | None |
Adapter: | Mellanox MT25208 InfiniHost III Ex (PCIe x8 Gen1 2.5 GT/s) |
Number of Adapters: | 2 |
Slot Type: | PCIe x8 Gen1 |
Data Rate: | InfiniBand 4x DDR |
Ports Used: | 2 |
Interconnect Type: | InfiniBand |
Software: File Server Node | |
---|---|
Adapter: | Mellanox MT25208 InfiniHost III Ex (PCIe x8 Gen1 2.5 GT/s) |
Adapter Driver: | OFED-1.4.2 |
Adapter Firmware: | 5.3.0 |
Operating System: | SUSE Linux Enterprise Server 11 SP1 (ia64) Kernel 2.6.32.12-0.7-default |
Local File System: | xfs |
Shared File System: | -- |
System State: | Run Level 3 (Multi-User) |
Other Software: | SGI ProPack 7SP1 for Linux, Build 701r2.sles11-1005242307 |
Hardware: Interconnect | |
---|---|
Vendor: | Mellanox Technologies |
Model: | None |
Switch Model: | Mellanox Infiniscale-IV |
Number of Switches: | 16 |
Number of Ports: | 36 |
Data Rate: | InfiniBand 4x QDR |
Firmware: | 5040005 |
Topology: | Enhanced HyperCube |
Primary Use: | MPI and I/O traffic |
The config file option 'submit' was used. For peak benchmarks that used 2048 MPI ranks, four ranks were assigned to each CPU die, leaving two cores per die idle.
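
The 2048-rank placement can be reproduced from the hardware description. This is a sketch under one stated assumption: each 12-core chip consists of two 6-core dies, which is inferred from the "6 MB shared / 6 cores" L3 cache entry rather than stated directly in this report.

```python
# Values from the compute node hardware table.
nodes = 128
chips_per_node = 2
cores_per_chip = 12

# Assumption: one die per group of 6 cores sharing an L3 cache.
cores_per_die = 6

# Placement described in the note above for the 2048-rank peak runs.
ranks_per_die = 4

dies_per_chip = cores_per_chip // cores_per_die
total_dies = nodes * chips_per_node * dies_per_chip
peak_ranks = total_dies * ranks_per_die
idle_cores_per_die = cores_per_die - ranks_per_die

# Using every core of every die gives the base rank count.
base_ranks = total_dies * cores_per_die

print(total_dies, peak_ranks, idle_cores_per_die, base_ranks)
```

Under that assumption, 512 dies with four ranks each give 2048 ranks with two idle cores per die, and fully populated dies give the 3072 base ranks.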
Software environment:
  export MPI_REQUEST_MAX=65536
  export MPI_TYPE_MAX=32768
  export MPI_BUFS_THRESHOLD=1
  ulimit -s unlimited

BIOS settings:
  AMI BIOS version 1.0a

Job Placement:
  In the base run, each MPI job is assigned to a topologically compact set of nodes, i.e. the minimal needed number of switches was used for each job: 1 switch for up to 192 ranks, 2 switches for 384 ranks, 4 switches for 768 ranks, 8 switches for 1536 ranks and 16 switches for 3072 ranks.
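
The switch counts in the job placement note follow from 3072 ranks spread evenly over 16 switches, i.e. 192 ranks per switch. The sketch below assumes that the minimal switch count is the rank count divided by 192, rounded up; the function name `switches_needed` is hypothetical, and the rule is only checked against the job sizes listed in the note.

```python
import math

# From the interconnect and hardware summary tables.
total_ranks = 3072
total_switches = 16

# Ranks served by one switch when ranks are spread evenly.
ranks_per_switch = total_ranks // total_switches


def switches_needed(ranks):
    """Assumed rule: smallest number of switches that can hold `ranks` ranks."""
    return math.ceil(ranks / ranks_per_switch)


# Job sizes named in the placement note.
for ranks in (192, 384, 768, 1536, 3072):
    print(ranks, switches_needed(ranks))
```

For the listed job sizes this yields 1, 2, 4, 8 and 16 switches, matching the note. The report does not say how other job sizes were placed, so the rule should not be read as covering them.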
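Base Compiler Invocation | |
---|---|
C benchmarks: | icc |
C++ benchmarks (126.lammps): | icpc |
Fortran benchmarks: | ifort |
Benchmarks using both Fortran and C: | icc ifort |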
Base Portability Flags | |
---|---|
121.pop2: | -DSPEC_MPI_CASE_FLAG |
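Base Optimization Flags | |
---|---|
C benchmarks: | -O3 -xSSE2 -no-prec-div |
C++ benchmarks (126.lammps): | -O3 -xSSE2 -no-prec-div -ansi-alias |
Fortran benchmarks: | -O3 -xSSE2 -no-prec-div |
Benchmarks using both Fortran and C: | -O3 -xSSE2 -no-prec-div |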
Peak Optimization Flags | |
---|---|
122.tachyon: | basepeak = yes |
125.RAxML: | basepeak = yes |
142.dmilc: | basepeak = yes |
126.lammps: | basepeak = yes |
129.tera_tf: | basepeak = yes |
137.lu: | -O3 -xSSE2 -no-prec-div |
143.dleslie: | Same as 137.lu |
145.lGemsFDTD: | Same as 137.lu |
121.pop2: | -O3 -xSSE2 -no-prec-div |
128.GAPgeofem: | Same as 121.pop2 |
132.zeusmp2: | Same as 121.pop2 |
147.l2wrf2: | basepeak = yes |
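
Several of the peak entries above are indirect: some say "Same as" another benchmark, and others say "basepeak = yes". The following sketch resolves the "Same as" references so that each benchmark maps to a concrete entry. The dictionary and the `resolve` helper are illustrative assumptions, not part of any SPEC tool. Entries marked "basepeak = yes" are left as they are; in the results table those benchmarks show identical base and peak measurements.

```python
# Peak entries copied from the list above.
peak_flags = {
    "122.tachyon": "basepeak = yes",
    "125.RAxML": "basepeak = yes",
    "142.dmilc": "basepeak = yes",
    "126.lammps": "basepeak = yes",
    "129.tera_tf": "basepeak = yes",
    "137.lu": "-O3 -xSSE2 -no-prec-div",
    "143.dleslie": "Same as 137.lu",
    "145.lGemsFDTD": "Same as 137.lu",
    "121.pop2": "-O3 -xSSE2 -no-prec-div",
    "128.GAPgeofem": "Same as 121.pop2",
    "132.zeusmp2": "Same as 121.pop2",
    "147.l2wrf2": "basepeak = yes",
}

PREFIX = "Same as "


def resolve(name):
    """Follow 'Same as X' references until a concrete entry is reached."""
    entry = peak_flags[name]
    if entry.startswith(PREFIX):
        return resolve(entry[len(PREFIX):])
    return entry


resolved = {name: resolve(name) for name in peak_flags}

# Benchmarks that were tuned separately for peak.
tuned = sorted(n for n, e in resolved.items() if e != "basepeak = yes")
print(tuned)
```

The six benchmarks that remain after filtering out "basepeak = yes" are the same six that ran with 2048 peak ranks in the results table.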