SPEC(R) MPIM2007 Summary
Dell, QLogic, ClusterVision, U. of Cambridge HPC Cluster Darwin, QLogic InfiniBand Interconnect
Mon May 28 03:16:49 2007

MPI2007 License: 0018                                    Test date: May-2007
Test sponsor: QLogic Corporation             Hardware availability: Jul-2006
Tested by: QLogic Performance Engineering    Software availability: Feb-2007

                 Base       Base       Base     Peak       Peak       Peak
Benchmarks      Ranks   Run Time      Ratio    Ranks   Run Time      Ratio
-------------- ------  ---------  ---------   ------  ---------  ---------
104.milc          512       40.3       38.8 S
104.milc          512       40.0       39.1 S
104.milc          512       40.1       39.0 *
107.leslie3d      512        144       36.1 S
107.leslie3d      512        139       37.5 *
107.leslie3d      512        137       38.0 S
113.GemsFDTD      512        592       10.7 S
113.GemsFDTD      512        601       10.5 S
113.GemsFDTD      512        596       10.6 *
115.fds4          512       51.8       37.7 *
115.fds4          512       50.1       38.9 S
115.fds4          512       70.9       27.5 S
121.pop2          512        198       20.8 S
121.pop2          512        195       21.2 S
121.pop2          512        196       21.0 *
122.tachyon       512       65.9       42.4 S
122.tachyon       512       66.2       42.2 *
122.tachyon       512       66.8       41.9 S
126.lammps        512        221       13.2 S
126.lammps        512        221       13.2 S
126.lammps        512        221       13.2 *
127.wrf2          512        125       62.4 S
127.wrf2          512        125       62.3 *
127.wrf2          512        140       55.7 S
128.GAPgeofem     512       28.9       71.5 *
128.GAPgeofem     512       28.8       71.8 S
128.GAPgeofem     512       29.4       70.2 S
129.tera_tf       512        117       23.6 *
129.tera_tf       512        116       23.8 S
129.tera_tf       512        125       22.2 S
130.socorro       512        133       28.7 *
130.socorro       512        135       28.4 S
130.socorro       512        132       29.0 S
132.zeusmp2       512       77.3       40.1 S
132.zeusmp2       512       71.2       43.6 *
132.zeusmp2       512       70.4       44.1 S
137.lu            512       53.5       68.7 *
137.lu            512       52.8       69.6 S
137.lu            512       67.9       54.1 S
==============================================================================
104.milc          512       40.1       39.0 *
107.leslie3d      512        139       37.5 *
113.GemsFDTD      512        596       10.6 *
115.fds4          512       51.8       37.7 *
121.pop2          512        196       21.0 *
122.tachyon       512       66.2       42.2 *
126.lammps        512        221       13.2 *
127.wrf2          512        125       62.3 *
128.GAPgeofem     512       28.9       71.5 *
129.tera_tf       512        117       23.6 *
130.socorro       512        133       28.7 *
132.zeusmp2       512       71.2       43.6 *
137.lu            512       53.5       68.7 *
SPECmpiM_base2007                      33.3
SPECmpiM_peak2007                   Not Run


BENCHMARK DETAILS
-----------------

  Type of System:       Homogeneous
  Total Compute Nodes:  128
  Total Chips:          256
  Total Cores:          512
  Total Threads:        512
  Total Memory:         1 TB
  Base Ranks Run:       512
  Minimum Peak Ranks:   --
  Maximum Peak Ranks:   --

  C Compiler:           QLogic PathScale C Compiler 3.0
  C++ Compiler:         QLogic PathScale C++ Compiler 3.0
  Fortran Compiler:     QLogic PathScale Fortran Compiler 3.0
  Base Pointers:        64-bit
  Peak Pointers:        64-bit
  MPI Library:          QLogic InfiniPath MPI 2.0
  Other MPI Info:       None
  Pre-processors:       No
  Other Software:       None


Node Description: Dell PowerEdge 1950
=====================================

HARDWARE
--------
  Number of nodes:      128
  Uses of the node:     compute, head
  Vendor:               Dell
  Model:                Dell PowerEdge 1950
  CPU Name:             Intel Xeon 5160
  CPU(s) orderable:     1-2 chips
  Chips enabled:        2
  Cores enabled:        4
  Cores per chip:       2
  Threads per core:     1
  CPU Characteristics:  1333 MHz system bus
  CPU MHz:              3000
  Primary Cache:        32 KB I + 32 KB D on chip per core
  Secondary Cache:      4 MB I+D on chip per chip
  L3 Cache:             None
  Other Cache:          None
  Memory:               8 GB (8 x 1 GB PC2-5300F)
  Disk Subsystem:       SAS, 73 GB, 15000 RPM
  Other Hardware:       None
  Adapter:              QLogic InfiniPath QLE7140
  Number of Adapters:   1
  Slot Type:            PCIe x8
  Data Rate:            InfiniBand 4x SDR
  Ports Used:           1
  Interconnect Type:    InfiniBand

SOFTWARE
--------
  Adapter:              QLogic InfiniPath QLE7140
  Adapter Driver:       InfiniPath 2.0
  Adapter Firmware:     None
  Operating System:     ClusterVisionOS 2.1
                        Based on Scientific Linux SL release 4.3 (Beryllium)
  Local File System:    Linux/ext3
  Shared File System:   NFS
  System State:         Multi-User
  Other Software:       Torque 2.1.2


Node Description: Dell PowerVault MD1000
========================================

HARDWARE
--------
  Number of nodes:      1
  Uses of the node:     file server
  Vendor:               Dell
  Model:                Dell PowerEdge 1950
  CPU Name:             Intel Xeon 5160
  CPU(s) orderable:     1-2 chips
  Chips enabled:        2
  Cores enabled:        4
  Cores per chip:       2
  Threads per core:     1
  CPU Characteristics:  1333 MHz system bus
  CPU MHz:              3000
  Primary Cache:        32 KB I + 32 KB D on chip per core
  Secondary Cache:      4 MB I+D on chip per chip
  L3 Cache:             None
  Other Cache:          None
  Memory:               4 GB (4 x 1 GB PC2-5300F)
  Disk Subsystem:       13.5 TB: 3 x 15 x 300 GB, SAS, 10000 RPM
                        3 Dell PowerVault MD1000 Disk Arrays, each with
                        15 disks.
  Other Hardware:       None
  Adapter:              Chelsio T310 10GBASE-SR RNIC (rev 3)
  Number of Adapters:   1
  Slot Type:            PCIe x8 MSI-X
  Data Rate:            10 Gbps Ethernet
  Ports Used:           1
  Interconnect Type:    Ethernet

SOFTWARE
--------
  Adapter:              Chelsio T310 10GBASE-SR RNIC (rev 3)
  Adapter Driver:       cxgb3 1.0.078
  Adapter Firmware:     T 3.3.0
  Operating System:     ClusterVisionOS 2.1
                        Based on Scientific Linux SL release 4.3 (Beryllium)
  Local File System:    Linux/ext3
  Shared File System:   NFS
  System State:         Multi-User
  Other Software:       None

General Notes
-------------
  A separate node that handles login and resource management is not
  listed because it is not performance related.


Interconnect Description: QLogic InfiniBand HCAs and switches
=============================================================

HARDWARE
--------
  Vendor:               QLogic
  Model:                InfiniPath adapters and Silverstorm switches

  Switch Model:         QLogic SilverStorm 9080 Fabric Director
                        (InfiniBand switch)
  Number of Switches:   2
  Number of Ports:      96
  Data Rate:            InfiniBand 4x SDR and InfiniBand 4x DDR
  Firmware:             3.4.0.1.3

  Switch Model:         QLogic SilverStorm 9240 InfiniBand switch
  Number of Switches:   1
  Number of Ports:      288
  Data Rate:            InfiniBand 4x SDR and InfiniBand 4x DDR
  Firmware:             3.4.0.1.3

  Topology:             Constant Bisectional Bandwidth, Fat-Tree,
                        Max 5 switch-chip hops.
  Primary Use:          MPI traffic

General Notes
-------------
  Two CUs (a CU is a Computational Unit of 65 nodes) were involved, so
  two SilverStorm 9080 switches and the 9240 core switch were used on
  this run. The data rate between the InfiniPath HCAs and the
  SilverStorm switches is SDR; the inter-switch links use DDR.

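The system totals listed under BENCHMARK DETAILS can be cross-checked
against the per-node figures in the node descriptions. The following is
only a sketch of that arithmetic. The variable names are illustrative, and
the unit conventions are assumptions: memory is taken as 1 TB = 1024 GB
and disk capacity as 1 TB = 1000 GB, which is what makes the reported
values come out exactly.

```python
# Per-node figures copied from the Dell PowerEdge 1950 compute node description.
nodes = 128
chips_per_node = 2       # "Chips enabled: 2"
cores_per_chip = 2       # "Cores per chip: 2"
threads_per_core = 1     # "Threads per core: 1"
memory_per_node_gb = 8   # "Memory: 8 GB (8 x 1 GB PC2-5300F)"

total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core

# Assumption: the report's "1 TB" uses binary units (1 TB = 1024 GB).
total_memory_tb = nodes * memory_per_node_gb / 1024

# File server disk subsystem: 3 arrays x 15 disks x 300 GB.
# Assumption: the report's "13.5 TB" uses decimal units (1 TB = 1000 GB).
file_server_disk_tb = 3 * 15 * 300 / 1000

print("Total Chips:  ", total_chips)
print("Total Cores:  ", total_cores)
print("Total Threads:", total_threads)
print("Total Memory: ", total_memory_tb, "TB")
print("File server:  ", file_server_disk_tb, "TB")
```

With one MPI rank per core, the 512 cores also match the 512 base ranks
reported for every benchmark.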

Interconnect Description: Ethernet Network for File Server Access
=================================================================

HARDWARE
--------
  Vendor:               Chelsio, Nortel
  Model:                Chelsio T310 adapters and Nortel 5530 5510 8610
                        switches

  Switch Model:         Nortel Ethernet Routing Switch 5510-24T
  Number of Switches:   1
  Number of Ports:      24
  Data Rate:            1 Gbps Ethernet
  Firmware:             1.0.0.16

  Switch Model:         Nortel Ethernet Routing Switch 5510-48T
  Number of Switches:   3
  Number of Ports:      48
  Data Rate:            1 Gbps Ethernet
  Firmware:             1.0.0.16

  Switch Model:         Nortel Ethernet Routing Switch 5530-24TFD
  Number of Switches:   2
  Number of Ports:      26
  Data Rate:            1 Gbps Ethernet (24 ports) and 10 Gbps Ethernet
                        (2 ports)
  Firmware:             4.2.0.12

  Switch Model:         Nortel Passport 8610 switch 4.1.0.0
  Number of Switches:   1
  Number of Ports:      24
  Data Rate:            10 Gbps Ethernet
  Firmware:             Optivity Switch Manager version 4.1

  Topology:             Three CUs are connected with six Ethernet Routing
                        switches 5530-24TFD, 5510-24T and 5510-48T as a
                        ring. Each of two 5530-24TFD switches is connected
                        to the Nortel Passport 8610 switch through two
                        10Gbit ports. See Slide 10 of
                        http://www.spec.org/mpi2007/results/supportingdocs/NortelEthernetSwitchDiagram.pdf
                        for a network diagram.
  Primary Use:          file system traffic


Base Compiler Invocation
------------------------
C benchmarks:
     /usr/bin/mpicc -cc=pathcc

C++ benchmarks:
     126.lammps: /usr/bin/mpicxx -CC=pathCC

Fortran benchmarks:
   107.leslie3d: /usr/bin/mpif90 -f90=pathf90
   113.GemsFDTD: /usr/bin/mpif90 -f90=pathf90
       115.fds4: /usr/bin/mpif90 -f90=pathf90
    129.tera_tf: /usr/bin/mpif90 -f90=pathf90
    132.zeusmp2: /usr/bin/mpif90 -f90=pathf90
         137.lu: /usr/bin/mpif90 -f90=pathf90

Benchmarks using both Fortran and C (except as noted below):
     /usr/bin/mpicc -cc=pathcc
     /usr/bin/mpif90 -f90=pathf90


Base Portability Flags
----------------------
       104.milc: -DSPEC_MPI_LP64
       121.pop2: -DSPEC_MPI_DOUBLE_UNDERSCORE -DSPEC_MPI_LP64
    122.tachyon: -DSPEC_MPI_LP64
       127.wrf2: -DF2CSTYLE -DSPEC_MPI_DOUBLE_UNDERSCORE -DSPEC_MPI_LINUX
                 -DSPEC_MPI_LP64
  128.GAPgeofem: -DSPEC_MPI_LP64
    130.socorro: -fno-second-underscore -DSPEC_MPI_LP64


Base Optimization Flags
-----------------------
C benchmarks:
     -march=core -Ofast

C++ benchmarks:
     126.lammps: -march=core -O3 -OPT:Ofast -CG:local_fwd_sched=on

Fortran benchmarks:
   107.leslie3d: -march=core -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off
   113.GemsFDTD: -march=core -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off
       115.fds4: -march=core -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off
    129.tera_tf: -march=core -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off
    132.zeusmp2: -march=core -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off
         137.lu: -march=core -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off

Benchmarks using both Fortran and C:
       121.pop2: -march=core -Ofast -O3 -OPT:Ofast -OPT:malloc_alg=1
                 -LANG:copyinout=off
       127.wrf2: Same as 121.pop2
  128.GAPgeofem: Same as 121.pop2
    130.socorro: Same as 121.pop2


Base Other Flags
----------------
C benchmarks:
     -IPA:max_jobs=4

C++ benchmarks:
     126.lammps: -IPA:max_jobs=4

Fortran benchmarks:
   107.leslie3d: -IPA:max_jobs=4
   113.GemsFDTD: -IPA:max_jobs=4
       115.fds4: -IPA:max_jobs=4
    129.tera_tf: -IPA:max_jobs=4
    132.zeusmp2: -IPA:max_jobs=4
         137.lu: -IPA:max_jobs=4

Benchmarks using both Fortran and C (except as noted below):
     -IPA:max_jobs=4


The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/MPI2007_flags.20070717.00.html

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/MPI2007_flags.20070717.00.xml


SPEC and SPEC MPI are registered trademarks of the Standard Performance
Evaluation Corporation. All other brand and product names appearing in
this result are trademarks or registered trademarks of their respective
holders.
-----------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact webmaster@spec.org.
Copyright 2006-2010 Standard Performance Evaluation Corporation
Tested with SPEC MPI2007 v58.
Report generated on Tue Jul 22 13:32:14 2014 by MPI2007 ASCII formatter v1463.
Originally published on 16 July 2007.
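
The summary table marks one of the three base runs of each benchmark with
"*" and repeats those rows below the "=" rule, followed by the overall
SPECmpiM_base2007 value. The following is a minimal sketch of how those
numbers relate, assuming the selected run is the median of the three
ratios and the overall metric is the geometric mean of the selected
ratios. The ratios are copied from the table above; the names base_ratios,
selected and geometric_mean are illustrative only and are not part of the
SPEC tools. Because the table shows rounded ratios, agreement with the
reported 33.3 can only be expected to the displayed precision.

```python
import math
from statistics import median

# Base ratios of the three runs per benchmark, copied from the summary table.
base_ratios = {
    "104.milc":      [38.8, 39.1, 39.0],
    "107.leslie3d":  [36.1, 37.5, 38.0],
    "113.GemsFDTD":  [10.7, 10.5, 10.6],
    "115.fds4":      [37.7, 38.9, 27.5],
    "121.pop2":      [20.8, 21.2, 21.0],
    "122.tachyon":   [42.4, 42.2, 41.9],
    "126.lammps":    [13.2, 13.2, 13.2],
    "127.wrf2":      [62.4, 62.3, 55.7],
    "128.GAPgeofem": [71.5, 71.8, 70.2],
    "129.tera_tf":   [23.6, 23.8, 22.2],
    "130.socorro":   [28.7, 28.4, 29.0],
    "132.zeusmp2":   [40.1, 43.6, 44.1],
    "137.lu":        [68.7, 69.6, 54.1],
}


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: the "*" row of each benchmark is the median of its three ratios.
selected = {name: median(ratios) for name, ratios in base_ratios.items()}

# Assumption: the overall metric is the geometric mean of the selected ratios.
overall = geometric_mean(selected.values())

for name, ratio in selected.items():
    print(f"{name:<14} {ratio:6.1f}")
print(f"geometric mean {overall:6.1f}")
```

Under these assumptions the selected ratios match the "*" rows, and the
geometric mean rounds to the SPECmpiM_base2007 value reported in the
summary.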