SPEC(R) MPIM2007 Summary
AMD, QLogic Corporation, Rackable Systems, IWILL
AMD Emerald Cluster: AMD Opteron CPUs, QLogic InfiniPath/SilverStorm Interconnect
Sat May 26 13:43:00 2007

MPI2007 License: 0018                             Test date:             May-2007
Test sponsor:    QLogic Corporation               Hardware availability: Nov-2006
Tested by:       QLogic Performance Engineering   Software availability: Jul-2007

                 Base     Base       Base       Peak     Peak       Peak
Benchmarks       Ranks  Run Time     Ratio      Ranks  Run Time     Ratio
--------------  ------  ---------  ---------   ------  ---------  ---------
104.milc            32        625       2.51 S     32        625       2.51 S
104.milc            32        746       2.10 S     32        746       2.10 S
104.milc            32        733       2.13 *     32        733       2.13 *
107.leslie3d        32       2124       2.46 *     32       2004       2.61 S
107.leslie3d        32       2355       2.22 S     32       2075       2.52 S
107.leslie3d        32       2030       2.57 S     32       2044       2.55 *
113.GemsFDTD        32       1317       4.79 S     32       1317       4.79 S
113.GemsFDTD        32       1559       4.05 *     32       1559       4.05 *
113.GemsFDTD        32       1659       3.80 S     32       1659       3.80 S
115.fds4            32        836       2.33 S     32        836       2.33 S
115.fds4            32        891       2.19 S     32        891       2.19 S
115.fds4            32        843       2.31 *     32        843       2.31 *
121.pop2            32       1059       3.90 S     32       1059       3.90 S
121.pop2            32       1112       3.71 S     32       1112       3.71 S
121.pop2            32       1112       3.71 *     32       1112       3.71 *
122.tachyon         32       1445       1.94 *     32       1445       1.94 *
122.tachyon         32       1464       1.91 S     32       1464       1.91 S
122.tachyon         32       1330       2.10 S     32       1330       2.10 S
126.lammps          32       1005       2.90 S     32       1005       2.90 S
126.lammps          32       1032       2.82 *     32       1032       2.82 *
126.lammps          32       1033       2.82 S     32       1033       2.82 S
127.wrf2            32       1547       5.04 S     32       1547       5.04 S
127.wrf2            32       1552       5.02 *     32       1552       5.02 *
127.wrf2            32       1557       5.01 S     32       1557       5.01 S
128.GAPgeofem       32        570       3.62 S     32        570       3.62 S
128.GAPgeofem       32        560       3.69 *     32        560       3.69 *
128.GAPgeofem       32        547       3.77 S     32        547       3.77 S
129.tera_tf         32        946       2.93 S     32        942       2.94 *
129.tera_tf         32        945       2.93 *     32        942       2.94 S
129.tera_tf         32        945       2.93 S     32        939       2.95 S
130.socorro         32        867       4.40 S     32        535       7.13 S
130.socorro         32        846       4.51 S     32        490       7.79 S
130.socorro         32        865       4.41 *     32        509       7.50 *
132.zeusmp2         32       1186       2.62 S     32       1186       2.62 S
132.zeusmp2         32       1171       2.65 *     32       1171       2.65 *
132.zeusmp2         32       1155       2.69 S     32       1155       2.69 S
137.lu              32       2641       1.39 S     32       2641       1.39 S
137.lu              32       2619       1.40 *     32       2619       1.40 *
137.lu              32       2565       1.43 S     32       2565       1.43 S
==============================================================================
104.milc            32        733       2.13 *     32        733       2.13 *
107.leslie3d        32       2124       2.46 *     32       2044       2.55 *
113.GemsFDTD        32       1559       4.05 *     32       1559       4.05 *
115.fds4            32        843       2.31 *     32        843       2.31 *
121.pop2            32       1112       3.71 *     32       1112       3.71 *
122.tachyon         32       1445       1.94 *     32       1445       1.94 *
126.lammps          32       1032       2.82 *     32       1032       2.82 *
127.wrf2            32       1552       5.02 *     32       1552       5.02 *
128.GAPgeofem       32        560       3.69 *     32        560       3.69 *
129.tera_tf         32        945       2.93 *     32        942       2.94 *
130.socorro         32        865       4.41 *     32        509       7.50 *
132.zeusmp2         32       1171       2.65 *     32       1171       2.65 *
137.lu              32       2619       1.40 *     32       2619       1.40 *
SPECmpiM_base2007                       2.87
SPECmpiM_peak2007                                                      3.00

BENCHMARK DETAILS
-----------------
Type of System:       Homogenous
Total Compute Nodes:  8
Total Chips:          16
Total Cores:          32
Total Threads:        32
Total Memory:         64 GB
Base Ranks Run:       32
Minimum Peak Ranks:   32
Maximum Peak Ranks:   32
C Compiler:           QLogic PathScale C Compiler 3.0
C++ Compiler:         QLogic PathScale C++ Compiler 3.0
Fortran Compiler:     QLogic PathScale Fortran Compiler 3.0
Base Pointers:        64-bit
Peak Pointers:        64-bit
MPI Library:          QLogic InfiniPath MPI 2.1
Other MPI Info:       None
Pre-processors:       No
Other Software:       None

Node Description: Rackable, IWILL, AMD
======================================

HARDWARE
--------
Number of nodes:      8
Uses of the node:     compute, head
Vendor:               Rackable Systems, IWILL, AMD
Model:                Rackable Systems C1000 chassis, IWILL DK8-HTX motherboard
CPU Name:             AMD Opteron 290
CPU(s) orderable:     1-2 chips
Chips enabled:        2
Cores enabled:        4
Cores per chip:       2
Threads per core:     1
CPU Characteristics:  --
CPU MHz:              2800
Primary Cache:        64 KB I + 64 KB D on chip per core
Secondary Cache:      1 MB I+D on chip per core
L3 Cache:             None
Other Cache:          None
Memory:               8 GB (8 x 1 GB DDR400)
Disk Subsystem:       250 GB, SATA
Other Hardware:       Nodes custom-built by Rackable Systems.
                      The Rackable C1000 chassis is half-depth with 450W,
                      48 VDC Power Supply.
                      Integrated Gigabit Ethernet for admin/filesystem.
Adapter:              Intel 82541PI Gigabit Ethernet controller
Number of Adapters:   1
Slot Type:            integrated on motherboard
Data Rate:            1 Gbps Ethernet
Ports Used:           1
Interconnect Type:    Ethernet
Adapter:              QLogic InfiniPath QHT7140
Number of Adapters:   1
Slot Type:            HTX
Data Rate:            InfiniBand 4x SDR
Ports Used:           1
Interconnect Type:    InfiniBand

SOFTWARE
--------
Adapter:              Intel 82541PI Gigabit Ethernet controller
Adapter Driver:       Part of Linux kernel modules
Adapter Firmware:     None
Adapter:              QLogic InfiniPath QHT7140
Adapter Driver:       InfiniPath 2.1
Adapter Firmware:     None
Operating System:     ClusterCorp Rocks 4.2.1
                      (Based on RedHat Enterprise Linux 4.0 Update 4)
Local File System:    Linux ext3
Shared File System:   NFS
System State:         Multi-User
Other Software:       Sun Grid Engine 6.0

Node Description: Headnode NFS filesystem
=========================================

HARDWARE
--------
Number of nodes:      1
Uses of the node:     file server, other
Vendor:               Tyan
Model:                Thunder K8QSD Pro (S4882) motherboard
CPU Name:             AMD Opteron 885
CPU(s) orderable:     1-4 chips
Chips enabled:        4
Cores enabled:        8
Cores per chip:       2
Threads per core:     1
CPU Characteristics:  --
CPU MHz:              2600
Primary Cache:        64 KB I + 64 KB D on chip per core
Secondary Cache:      1 MB I+D on chip per core
L3 Cache:             None
Other Cache:          None
Memory:               16 GB (16 x 1 GB DDR400 dimms)
Disk Subsystem:       250 GB, SATA, 7200 RPM
Other Hardware:       None
Adapter:              Broadcom BCM5704C
Number of Adapters:   2
Slot Type:            integrated on motherboard
Data Rate:            1 Gbps Ethernet
Ports Used:           2
Interconnect Type:    Ethernet

SOFTWARE
--------
Adapter:              Broadcom BCM5704C
Adapter Driver:       Part of Linux kernel modules
Adapter Firmware:     None
Operating System:     ClusterCorp Rocks 4.2.1
                      (Based on RedHat Enterprise Linux 4.0 Update 4)
Local File System:    Linux ext3
Shared File System:   NFS
System State:         Multi-User
Other Software:       Sun Grid Engine 6.0

General Notes
-------------
"other" purposes of this node: login, compile, job submission and queuing.
This node is assembled with a 2U chassis and 700 watt ATX 12V Power Supply.

Interconnect Description: QLogic InfiniBand HCAs and switches
=============================================================

HARDWARE
--------
Vendor:               QLogic
Model:                InfiniPath and Silverstorm
Switch Model:         QLogic SilverStorm 9120 Fabric Director
Number of Switches:   1
Number of Ports:      144
Data Rate:            InfiniBand 4x SDR and InfiniBand 4x DDR
Firmware:             3.4.0.5.2
Topology:             Single switch (star)
Primary Use:          MPI traffic

General Notes
-------------
The data rate between InfiniPath HCAs and SilverStorm switches is SDR.
However, DDR is used for inter-switch links.

Interconnect Description: Broadcom NICs, Force10 switches
=========================================================

HARDWARE
--------
Vendor:               Force10
Model:                E300
Switch Model:         Force10 E300 Gig-E switch
Number of Switches:   1
Number of Ports:      288
Data Rate:            1 Gbps Ethernet
Firmware:             N/A
Topology:             Single switch (star)
Primary Use:          file system traffic

Compiler Invocation
-------------------
C benchmarks:
     /usr/bin/mpicc -cc=pathcc

C++ benchmarks:
     126.lammps: /usr/bin/mpicxx -CC=pathCC

Fortran benchmarks:
     /usr/bin/mpif90 -f90=pathf90

Benchmarks using both Fortran and C:
     /usr/bin/mpicc -cc=pathcc /usr/bin/mpif90 -f90=pathf90

Portability Flags
-----------------
     104.milc: -DSPEC_MPI_LP64
     115.fds4: -DSPEC_MPI_LC_TRAILING_DOUBLE_UNDERSCORE -DSPEC_MPI_LP64
     121.pop2: -DSPEC_MPI_DOUBLE_UNDERSCORE -DSPEC_MPI_LP64
  122.tachyon: -DSPEC_MPI_LP64
     127.wrf2: -DF2CSTYLE -DSPEC_MPI_DOUBLE_UNDERSCORE -DSPEC_MPI_LINUX
               -DSPEC_MPI_LP64
128.GAPgeofem: -DSPEC_MPI_LP64
  130.socorro: -fno-second-underscore -DSPEC_MPI_LP64
  132.zeusmp2: -DSPEC_MPI_LP64

Base Optimization Flags
-----------------------
C benchmarks:
     -march=opteron -Ofast -OPT:malloc_alg=1

C++ benchmarks:
     126.lammps: -march=opteron -O3 -OPT:Ofast -CG:local_fwd_sched=on

Fortran benchmarks:
     -march=opteron -O3 -OPT:Ofast -OPT:malloc_alg=1 -LANG:copyinout=off

Benchmarks using both Fortran and C:
     -march=opteron -Ofast -OPT:malloc_alg=1 -O3 -OPT:Ofast
     -LANG:copyinout=off

Peak Optimization Flags
-----------------------
C benchmarks:
        104.milc: basepeak = yes
     122.tachyon: basepeak = yes

C++ benchmarks:
      126.lammps: basepeak = yes

Fortran benchmarks:
    107.leslie3d: -march=opteron -Ofast -OPT:unroll_size=256
    113.GemsFDTD: basepeak = yes
     129.tera_tf: -march=opteron -O3 -OPT:Ofast -OPT:malloc_alg=1
                  -OPT:unroll_size=256
          137.lu: basepeak = yes

Benchmarks using both Fortran and C:
        115.fds4: basepeak = yes
        121.pop2: basepeak = yes
        127.wrf2: basepeak = yes
   128.GAPgeofem: basepeak = yes
     130.socorro: -march=opteron -Ofast -OPT:malloc_alg=1 -O3 -OPT:Ofast
                  -LANG:copyinout=off
                  -L/net/files/tools/acml/x86_64/acml3.5.0/pathscale64/lib
                  -lacml
     132.zeusmp2: basepeak = yes

Other Flags
-----------
C benchmarks:
     -IPA:max_jobs=4

C++ benchmarks:
     126.lammps: -IPA:max_jobs=4

Fortran benchmarks:
     -IPA:max_jobs=4

Benchmarks using both Fortran and C:
     -IPA:max_jobs=4

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/MPI2007_flags.20070717.01.html

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/MPI2007_flags.20070717.01.xml

SPEC and SPEC MPI are registered trademarks of the Standard Performance
Evaluation Corporation. All other brand and product names appearing in
this result are trademarks or registered trademarks of their respective
holders.
-----------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact webmaster@spec.org.
Copyright 2006-2010 Standard Performance Evaluation Corporation
Tested with SPEC MPI2007 v60.
Report generated on Tue Jul 22 13:32:32 2014 by MPI2007 ASCII formatter v1463.
Originally published on 16 July 2007.
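
Reader's note on the summary figures: the sketch below is not part of the SPEC tooling. It assumes that the starred (*) run of each benchmark is the median of its three runs and that the overall metric is the geometric mean of the starred ratios. Under those assumptions the numbers copied from the results table round to the reported SPECmpiM_base2007 and SPECmpiM_peak2007 values. The names geometric_mean, base_ratios, and peak_ratios are illustrative only.

```python
from math import exp, log
from statistics import median

# Starred (selected) base ratios, copied from the results table.
base_ratios = {
    "104.milc": 2.13, "107.leslie3d": 2.46, "113.GemsFDTD": 4.05,
    "115.fds4": 2.31, "121.pop2": 3.71, "122.tachyon": 1.94,
    "126.lammps": 2.82, "127.wrf2": 5.02, "128.GAPgeofem": 3.69,
    "129.tera_tf": 2.93, "130.socorro": 4.41, "132.zeusmp2": 2.65,
    "137.lu": 1.40,
}

# Peak equals base except for the three benchmarks built with peak flags.
peak_ratios = dict(base_ratios)
peak_ratios.update({"107.leslie3d": 2.55, "129.tera_tf": 2.94,
                    "130.socorro": 7.50})


def geometric_mean(values):
    """Geometric mean, computed via logarithms for numerical stability."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


# Assumption: the starred run is the median of the three run times.
leslie3d_base_times = [2124, 2355, 2030]
leslie3d_peak_times = [2004, 2075, 2044]
selected_base_time = median(leslie3d_base_times)
selected_peak_time = median(leslie3d_peak_times)

base_score = geometric_mean(base_ratios.values())
peak_score = geometric_mean(peak_ratios.values())

print(f"107.leslie3d selected times: base {selected_base_time}, "
      f"peak {selected_peak_time}")
print(f"base metric: {base_score:.2f}")
print(f"peak metric: {peak_score:.2f}")
```

Only 107.leslie3d, 129.tera_tf, and 130.socorro contribute to the gap between base and peak, since every other benchmark is marked basepeak = yes.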