SPEC® CFP2006 Result

Copyright 2006-2014 Standard Performance Evaluation Corporation

IBM Corporation

IBM Power 795 (4.0 GHz, 256 core, SLES)

CPU2006 license: 11
Test date: Aug-2010
Test sponsor: IBM Corporation
Hardware Availability: Sep-2010
Tested by: IBM Corporation
Software Availability: Aug-2010
Benchmark results graph
Hardware
CPU Name: POWER7
CPU Characteristics: Intelligent Energy Optimization enabled, up to 4.14 GHz
CPU MHz: 4004
FPU: Integrated
CPU(s) enabled: 256 cores, 32 chips, 8 cores/chip, 4 threads/core
CPU(s) orderable: 32,64,96,128,160,192,224,256 cores
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 4 MB I+D on chip per core
Other Cache: None
Memory: 2 TB (256x8 GB) DDR3 1066 MHz
Disk Subsystem: 17x146.8 GB Raid0 SAS SFF 15K RPM
Other Hardware: None
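
The hardware figures above imply a few derived totals that help when reading the copy counts in the results table. The following is a small arithmetic sketch, not part of the original report; the variable names are illustrative, and 1 TB is taken as 1024 GB.

```python
# Values copied from the Hardware section of this report.
CHIPS = 32
CORES_PER_CHIP = 8
THREADS_PER_CORE = 4
DIMM_COUNT = 256
GB_PER_DIMM = 8
L3_MB_PER_CORE = 4

cores = CHIPS * CORES_PER_CHIP                    # enabled cores
hardware_threads = cores * THREADS_PER_CORE       # SMT4 logical CPUs
memory_gb = DIMM_COUNT * GB_PER_DIMM              # installed memory in GB
memory_tb = memory_gb / 1024                      # assumes 1 TB = 1024 GB
l3_mb_per_chip = L3_MB_PER_CORE * CORES_PER_CHIP  # on-chip L3 per POWER7 chip

print(f"{cores} cores, {hardware_threads} hardware threads")
print(f"{memory_gb} GB = {memory_tb:.0f} TB memory, {l3_mb_per_chip} MB L3 per chip")
```

The 1024 hardware threads match the largest copy count in the results table, and 256 matches the core count used for the peak runs of 433.milc and 437.leslie3d.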
Software
Operating System: SUSE Linux Enterprise Server 11 SP1 (ppc64), Kernel 2.6.32.12-0.7-ppc64
Compiler: IBM XL C/C++ for Linux, V11.1
IBM XL Fortran for Linux, V13.1
Auto Parallel: No
File System: xfs
System State: Run level 5 (multi-user)
Base Pointers: 32-bit
Peak Pointers: 32/64-bit
Other Software: -Post-Link Optimization for Linux on POWER, Version 5.5.0-3
-MicroQuill SmartHeap 9
-Apache C++ Standard Library V4.2.1

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

                Base                                                    Peak
Benchmark       Copies  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Copies  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
410.bwaves        1008     3232   4240     1194  11500     1224  11200    1008     1172  11700     1197  11400     1164  11800
416.gamess        1008     2281   8650     2278   8660     2284   8640    1024     2169   9250     2169   9240     2175   9220
433.milc          1008      787  11800      776  11900      779  11900     256      198  11900      199  11800      198  11900
434.zeusmp        1008     1035   8860     1036   8850     1035   8860    1008     1035   8860     1036   8850     1035   8860
435.gromacs       1008     1036   6950     1030   6990     1033   6970    1024      796   9180      792   9240      793   9210
436.cactusADM     1008     1169  10300     1154  10400     1168  10300    1008     1169  10300     1154  10400     1168  10300
437.leslie3d      1008     1252   7570     1243   7620     1252   7570     256      289   8340      288   8360      286   8410
444.namd          1008      736  11000      722  11200      721  11200    1024      721  11400      706  11600      703  11700
447.dealII        1008      689  16700      690  16700      688  16800    1024      545  21500      541  21700      538  21800
450.soplex        1008     1356   6200     1129   7440     1124   7480    1008     1387   6060     1123   7480     1119   7510
453.povray        1008      562   9540      557   9620      562   9550    1024      550   9900      531  10300      532  10200
454.calculix      1008      993   8380      993   8380     1023   8130    1024      989   8540      994   8500     1001   8440
459.GemsFDTD      1008     1813   5900     1698   6300     1699   6290    1008     1813   5900     1698   6300     1699   6290
465.tonto         1008     1610   6160     1614   6140     1612   6150    1024     1014   9940      870  11600      900  11200
470.lbm           1008      824  16800      817  16900      828  16700    1008      824  16800      817  16900      828  16700
481.wrf           1008     1400   8040     1406   8010     1405   8010    1024     1944   5880     1143  10000     1145   9990
482.sphinx3       1008     1909  10300     1891  10400     1901  10300    1008     1909  10300     1891  10400     1901  10300
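
Because the bold/underline marking of median runs does not survive in plain text, the median of each three-run set has to be recovered from the numbers. The sketch below is not part of the original report: it parses one results row (whitespace-separated, in the column order of the table above), picks the middle run by elapsed seconds, and compares the peak and base medians. The helper names `parse_row` and `median_run` are hypothetical.

```python
def parse_row(row):
    """Split one results row into (name, base, peak).

    base and peak are (copies, [(seconds, ratio), ...]) with three runs each.
    """
    fields = row.split()
    name = fields[0]
    nums = [int(x) for x in fields[1:]]

    def block(vals):
        copies = vals[0]
        runs = [(vals[i], vals[i + 1]) for i in range(1, 7, 2)]
        return copies, runs

    return name, block(nums[:7]), block(nums[7:14])


def median_run(runs):
    """Return the middle run when the three runs are ordered by seconds."""
    return sorted(runs)[len(runs) // 2]


# Row copied from the results table.
row = "447.dealII 1008 689 16700 690 16700 688 16800 1024 545 21500 541 21700 538 21800"
name, (base_copies, base_runs), (peak_copies, peak_runs) = parse_row(row)

base_median = median_run(base_runs)
peak_median = median_run(peak_runs)
gain = peak_median[1] / base_median[1]  # ratio of median rate ratios

print(name, "base median:", base_median, "peak median:", peak_median)
print(f"peak/base gain: {gain:.2f}x")
```

Sorting by seconds also handles outliers such as the first base run of 410.bwaves (3232 s), which falls at the end of the ordering and is therefore not selected as the median.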

Peak Tuning Notes

 fdpr binary optimization tool used for:
 433.milc 435.gromacs 436.cactusADM 450.soplex 482.sphinx3
      with options -O4 -nodp
 434.zeusmp
      with options -O4 -vrox -nodp
 437.leslie3d 444.namd
      with options -O3 -lu -1 -nodp -sdp 9
 465.tonto
      with options -O4

Submit Notes

The config file option 'submit' was used.
Benchmarks bound to a processor using numactl on the submit command.
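
The exact submit line is not reproduced in this report. As a hypothetical illustration only, the sketch below builds the kind of command a numactl-based submit setting could expand to for a single copy: it assumes each copy is pinned to the logical CPU whose number equals the copy number (`--physcpubind`) and allocates memory on the local node (`--localalloc`). The function name and the example benchmark command are invented for the example.

```python
def build_submit(copy_num, command):
    """Return a shell line that pins one benchmark copy to one logical CPU.

    Assumption: copy N runs on logical CPU N. The real mapping used for
    this result is not stated in the report.
    """
    prefix = ["numactl", f"--physcpubind={copy_num}", "--localalloc"]
    return " ".join(prefix) + " " + command


# Hypothetical benchmark command, for illustration only.
for copy_num in range(3):
    print(build_submit(copy_num, "./benchmark_exe < input.in"))
```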

Operating System Notes

ulimit -s (stack) set to 1048576.
Large pages reserved as follows by root user:
      echo 67584 > /proc/sys/vm/nr_overcommit_hugepages
The following environment variables were set before the runspec command:
      export HUGETLB_VERBOSE=0
      export HUGETLB_MORECORE=yes
      export HUGETLB_ELFMAP=W
      export XLFRTEOPTS=intrinthds=1
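
To put the settings above in perspective, the following sketch converts them to sizes. It is not part of the original report and rests on two assumptions: that the huge page size on this system is 16 MiB (the report does not state it), and that `ulimit -s` is expressed in KiB, as in common shells.

```python
# Values copied from the Operating System Notes.
NR_OVERCOMMIT_HUGEPAGES = 67584
STACK_LIMIT_KIB = 1048576

# Assumption, not stated in the report: 16 MiB huge pages.
ASSUMED_HUGEPAGE_MIB = 16

# Upper bound on memory that can be backed by overcommitted huge pages.
hugepage_limit_gib = NR_OVERCOMMIT_HUGEPAGES * ASSUMED_HUGEPAGE_MIB / 1024

# Per-process stack limit.
stack_limit_gib = STACK_LIMIT_KIB / (1024 * 1024)

print(f"huge page limit: {hugepage_limit_gib:.0f} GiB")
print(f"stack limit: {stack_limit_gib:.0f} GiB")
```

Under the 16 MiB assumption, the huge page limit comes to roughly half of the 2 TB of installed memory.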

General Notes


 447.dealII (peak): "apache_stdcxx_4_2_1" src.alt was used.

The Apache C++ Standard Library V4.2.1 was installed from
http://stdcxx.apache.org/download.html using:
    gmake BUILDTYPE=8d CONFIG=gcc.config

Base Compiler Invocation

C benchmarks:

 xlc   -qlanglvl=extc99 

C++ benchmarks:

 xlC 

Fortran benchmarks:

 xlf95 

Benchmarks using both Fortran and C:

 xlc   -qlanglvl=extc99   xlf95 

Base Portability Flags

410.bwaves:  -qfixed 
416.gamess:  -qfixed 
434.zeusmp:  -qfixed 
435.gromacs:  -qfixed   -qextname 
436.cactusADM:  -qfixed   -qextname 
437.leslie3d:  -qfixed 
454.calculix:  -qfixed   -qextname 
481.wrf:  -DNOUNDERSCORE 
482.sphinx3:  -qchars=signed 

Base Optimization Flags

C benchmarks:

 -O5   -qarch=pwr7   -qtune=pwr7   -lhugetlbfs 

C++ benchmarks:

 -O5   -qarch=pwr7   -qtune=pwr7   -qrtti   -lhugetlbfs 

Fortran benchmarks:

 -O5   -qarch=pwr7   -qtune=pwr7   -qsmallstack=dynlenonheap   -qalias=nostd   -B/usr/share/libhugetlbfs/   -tl   -Wl,--hugetlbfs-align 

Benchmarks using both Fortran and C:

 -O5   -qarch=pwr7   -qtune=pwr7   -qsmallstack=dynlenonheap   -qalias=nostd   -B/usr/share/libhugetlbfs/   -tl   -Wl,--hugetlbfs-align 

Base Other Flags

C benchmarks:

 -qipa=noobject   -qipa=threads 

C++ benchmarks:

 -qipa=noobject   -qipa=threads 

Fortran benchmarks:

 -qipa=noobject   -qipa=threads 

Benchmarks using both Fortran and C:

 -qipa=noobject   -qipa=threads 

Peak Compiler Invocation

C benchmarks:

 xlc   -qlanglvl=extc99 

C++ benchmarks:

 xlC 

Fortran benchmarks:

 xlf95 

Benchmarks using both Fortran and C:

 xlc   -qlanglvl=extc99   xlf95 

Peak Portability Flags

410.bwaves:  -qfixed 
416.gamess:  -qfixed 
434.zeusmp:  -qfixed 
435.gromacs:  -qfixed   -qextname 
436.cactusADM:  -qfixed   -qextname 
437.leslie3d:  -qfixed 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -qfixed   -qextname 
481.wrf:  -DNOUNDERSCORE 
482.sphinx3:  -qchars=signed 

Peak Optimization Flags

C benchmarks:

433.milc:  -Wl,-q   -O5   -qarch=pwr7   -qtune=pwr7   -lhugetlbfs 
470.lbm:  basepeak = yes 
482.sphinx3:  basepeak = yes 

C++ benchmarks:

444.namd:  -Wl,-q   -qpdf1(pass 1)   -qpdf2(pass 2)   -O5   -qarch=pwr7   -qtune=pwr7   -lhugetlbfs 
447.dealII:  -O4   -qarch=pwr7   -qtune=pwr7   -qrtti   -qcpp_stdinc=/root/stdcxx421/include/ansi:/root/stdcxx421/include   -lsmartheap   -lhugetlbfs   -L/root/stdcxx421/lib   -R/root/stdcxx421/lib   -lstd8d 
450.soplex:  -Wl,-q   -qpdf1(pass 1)   -qpdf2(pass 2)   -O3   -qtune=auto   -qarch=pwr5   -lhugetlbfs 
453.povray:  -Wl,-q   -qpdf1(pass 1)   -qpdf2(pass 2)   -O4   -qarch=pwr7   -qtune=pwr7   -qsimd   -q64   -lsmartheap64 

Fortran benchmarks:

410.bwaves:  -qpdf1(pass 1)   -qpdf2(pass 2)   -O4   -qarch=pwr7   -qtune=pwr7   -qsmallstack=dynlenonheap   -q64   -lhugetlbfs 
416.gamess:  -qpdf1(pass 1)   -qpdf2(pass 2)   -O5   -qarch=pwr7   -qtune=pwr7   -qalias=nostd   -lhugetlbfs 
434.zeusmp:  basepeak = yes 
437.leslie3d:  -Wl,-q   -O5   -qarch=pwr7   -qtune=pwr7   -q64   -B/usr/share/libhugetlbfs/   -tl   -Wl,--hugetlbfs-align 
459.GemsFDTD:  basepeak = yes 
465.tonto:  -Wl,-q   -qpdf1(pass 1)   -qpdf2(pass 2)   -O5   -qarch=pwr7   -qtune=pwr7   -qsimd   -lhugetlbfs 

Benchmarks using both Fortran and C:

435.gromacs:  -Wl,-q   -qpdf1(pass 1)   -qpdf2(pass 2)   -O4   -qarch=pwr7   -qtune=pwr7   -qsimd   -lhugetlbfs 
436.cactusADM:  basepeak = yes 
454.calculix:  -qpdf1(pass 1)   -qpdf2(pass 2)   -O5   -qarch=pwr7   -qtune=pwr7   -B/usr/share/libhugetlbfs/   -tl   -Wl,--hugetlbfs-align 
481.wrf:  -O3   -qarch=pwr7   -qtune=pwr7   -q64   -lhugetlbfs 

Peak Other Flags

C benchmarks:

 -qipa=noobject   -qipa=threads 

C++ benchmarks:

 -qipa=noobject   -qipa=threads 

Fortran benchmarks:

 -qipa=noobject   -qipa=threads 

Benchmarks using both Fortran and C:

 -qipa=noobject   -qipa=threads 

The flags file that was used to format this result can be browsed at
http://www.spec.org/cpu2006/flags/IBM-Linux-XL.20100901.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/cpu2006/flags/IBM-Linux-XL.20100901.xml.