MPI2007 Result Flag Description

Base Portability Flags

104.milc

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.
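
As a rough illustration of the data model that -DSPEC_MPI_LP64 declares, the following Python sketch classifies a set of type widths and reports the widths visible through the running interpreter's C runtime. The helper names `data_model` and `host_widths`, and the labels used for the non-LP64 cases, are choices made for this example and are not part of SPEC MPI2007.

```python
import ctypes


def data_model(int_bits, long_bits, pointer_bits):
    """Name the C data model for the given widths (illustrative helper)."""
    widths = (int_bits, long_bits, pointer_bits)
    if widths == (32, 64, 64):
        return "LP64"   # 32-bit int, 64-bit long and pointer
    if widths == (32, 32, 64):
        return "LLP64"  # 32-bit int and long, 64-bit pointer
    if widths == (32, 32, 32):
        return "ILP32"  # everything 32 bits wide
    return "other"


def host_widths():
    """Return (int, long, pointer) widths in bits as seen through ctypes."""
    types = (ctypes.c_int, ctypes.c_long, ctypes.c_void_p)
    return tuple(8 * ctypes.sizeof(t) for t in types)


if __name__ == "__main__":
    widths = host_widths()
    print(widths, data_model(*widths))
```

On a host where this reports LP64, the widths match what the macro asserts; any other result suggests the flag would not describe that system.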

115.fds4

- -DSPEC_MPI_LC_TRAILING_DOUBLE_UNDERSCORE
- CPORTABILITY
- SPEC_MPI_LC_TRAILING_DOUBLE_UNDERSCORE: convert the name to lowercase and append two underscores.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.
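
The name transformation described for SPEC_MPI_LC_TRAILING_DOUBLE_UNDERSCORE can be sketched in a few lines of Python. The function name and the sample identifier `Wall_Time` are hypothetical; the sketch only shows the "lowercase, then append two underscores" rule stated above.

```python
def lc_trailing_double_underscore(name):
    """Lowercase the name and append two underscores (illustrative only)."""
    return name.lower() + "__"


if __name__ == "__main__":
    # "Wall_Time" is a made-up identifier used purely for demonstration.
    print(lc_trailing_double_underscore("Wall_Time"))
```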

121.pop2

- -DSPEC_MPI_DOUBLE_UNDERSCORE
- CPORTABILITY
- SPEC_MPI_DOUBLE_UNDERSCORE may be used in SPEC MPI2007.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

122.tachyon

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

127.wrf2

- -DF2CSTYLE
- CPORTABILITY
- This flag should be used if C function names need to have two underscores appended in order for them to be callable by Fortran functions.
- -DSPEC_MPI_DOUBLE_UNDERSCORE
- CPORTABILITY
- SPEC_MPI_DOUBLE_UNDERSCORE may be used in SPEC MPI2007.
- -DSPEC_MPI_LINUX
- CPORTABILITY
- This macro indicates that the benchmark is being compiled on a Linux system.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

128.GAPgeofem

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

130.socorro

- -fno-second-underscore
- FPORTABILITY
- CFP2006:
  
  If -funderscoring is in effect, and the original Fortran external identifier contained an underscore, -fsecond-underscore appends a second underscore to the one added by -funderscoring. -fno-second-underscore does not append a second underscore. The default is both -funderscoring and -fsecond-underscore, the same defaults as g77 uses. -fno-second-underscore corresponds to the default policies of PGI Fortran and Intel Fortran.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.
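
The interaction between -funderscoring and -fsecond-underscore described for -fno-second-underscore can be sketched as follows. This is a simplified Python model of the rule as stated in the description, not compiler code; the function name `external_symbol` and the sample identifiers are hypothetical, and folding names to lowercase is an assumption of the sketch.

```python
def external_symbol(name, underscoring=True, second_underscore=True):
    """Model the trailing-underscore rule for Fortran external identifiers."""
    symbol = name.lower()  # assumption: external names are folded to lowercase
    if not underscoring:
        return symbol      # with underscoring off, nothing is appended
    symbol += "_"          # -funderscoring appends one underscore
    if second_underscore and "_" in name:
        symbol += "_"      # -fsecond-underscore adds another if the name had one
    return symbol


if __name__ == "__main__":
    for ident in ("solve", "my_sub"):  # made-up identifiers
        print(ident,
              external_symbol(ident),
              external_symbol(ident, second_underscore=False))
```

Under this model, -fno-second-underscore changes only identifiers that already contain an underscore, which is why it matters when linking against objects built with a compiler that never appends the second one.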

132.zeusmp2

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

Peak Portability Flags

104.milc

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

115.fds4

- -DSPEC_MPI_LC_TRAILING_DOUBLE_UNDERSCORE
- CPORTABILITY
- SPEC_MPI_LC_TRAILING_DOUBLE_UNDERSCORE: convert the name to lowercase and append two underscores.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

121.pop2

- -DSPEC_MPI_DOUBLE_UNDERSCORE
- CPORTABILITY
- SPEC_MPI_DOUBLE_UNDERSCORE may be used in SPEC MPI2007.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

122.tachyon

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

127.wrf2

- -DF2CSTYLE
- CPORTABILITY
- This flag should be used if C function names need to have two underscores appended in order for them to be callable by Fortran functions.
- -DSPEC_MPI_DOUBLE_UNDERSCORE
- CPORTABILITY
- SPEC_MPI_DOUBLE_UNDERSCORE may be used in SPEC MPI2007.
- -DSPEC_MPI_LINUX
- CPORTABILITY
- This macro indicates that the benchmark is being compiled on a Linux system.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

128.GAPgeofem

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

130.socorro

- -fno-second-underscore
- FPORTABILITY
- CFP2006:
  
  If -funderscoring is in effect, and the original Fortran external identifier contained an underscore, -fsecond-underscore appends a second underscore to the one added by -funderscoring. -fno-second-underscore does not append a second underscore. The default is both -funderscoring and -fsecond-underscore, the same defaults as g77 uses. -fno-second-underscore corresponds to the default policies of PGI Fortran and Intel Fortran.
- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

132.zeusmp2

- -DSPEC_MPI_LP64
- EXTRA_CPORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks use this macro.

Base Optimization Flags

C benchmarks

- -march=opteron
- CC, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -Ofast
- COPTIMIZE
- Equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math.
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations.
  
  NOTE: -Ofast enables -ipa (inter-procedural analysis), which places limitations on how libraries and .o files are built.
- Includes:
  - -O3
  - -ipa
  - -OPT:Ofast
  - -fno-math-errno
  - -ffast-math
- -OPT:malloc_alg=1
- COPTIMIZE
- -OPT:malloc_alg=(0|1)
  Select an alternate malloc algorithm which may improve speed. The compiler adds setup code in the C/C++/Fortran "main" function to enable the chosen algorithm. The default is 0.
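
The floating-point caveat attached to -Ofast comes from the fact that floating-point addition is not associative, so rearranging a computation can change its result in the last bits. The Python sketch below only demonstrates that arithmetic property with IEEE double precision; it does not model what the compiler actually rearranges, and the variable names are chosen for this example.

```python
def grouped_left(a, b, c):
    """Evaluate (a + b) + c, the order written in the source."""
    return (a + b) + c


def grouped_right(a, b, c):
    """Evaluate a + (b + c), a mathematically equivalent regrouping."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 0.1, 0.2, 0.3
    left = grouped_left(a, b, c)
    right = grouped_right(a, b, c)
    # The two groupings agree mathematically but not bit-for-bit.
    print(repr(left), repr(right), left == right)
```

Differences of this size are usually harmless, but they explain why results built with such options may not match a strictly ordered evaluation exactly.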

C++ benchmarks

126.lammps

- -march=opteron
- CXX, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -O3
- CXXOPTIMIZE
- Specify the basic level of optimization desired.
  The options can be one of the following:
  
  0    Turn off all optimizations.
  
  1    Turn on local optimizations that can be done quickly. Do peephole optimizations and instruction scheduling.
  
  2    Turn on extensive optimization. This is the default.
  The optimizations at this level are generally conservative, in the sense that they are virtually always beneficial and avoid changes which affect such things as floating point accuracy. In addition to the level 1 optimizations, do inner loop unrolling, if-conversion, two passes of instruction scheduling, global register allocation, dead store elimination, instruction scheduling across basic blocks, and partial redundancy elimination.
  
  3    Turn on aggressive optimization.
  The optimizations at this level are distinguished from -O2 by their aggressiveness, generally seeking highest-quality generated code even if it requires extensive compile time. They may include optimizations that are generally beneficial but may hurt performance.
  This includes but is not limited to turning on the Loop Nest Optimizer, -LNO:opt=1, and setting -OPT:roundoff=1:IEEE_arithmetic=2:Olimit=9000:reorg_common=ON.
  
  s    Specify that code size is to be given priority in tradeoffs with execution time.
  If no value is specified, 2 is assumed.
- -OPT:Ofast
- CXXOPTIMIZE
- -OPT:Ofast
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations. This effectively turns on the following optimizations: -OPT:ro=2:Olimit=0:div_split=ON:alias=typed.
- -CG:local_fwd_sched=on
- CXXOPTIMIZE
- -CG:local_fwd_sched : Change the instruction scheduling algorithm to work forward instead of backward for the instructions in each basic block. The default is OFF for 64-bit ABI, and ON for 32-bit ABI.

Fortran benchmarks

- -march=opteron
- FC, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -O3
- FOPTIMIZE
- Specify the basic level of optimization desired.
  The options can be one of the following:
  
  0    Turn off all optimizations.
  
  1    Turn on local optimizations that can be done quickly. Do peephole optimizations and instruction scheduling.
  
  2    Turn on extensive optimization. This is the default.
  The optimizations at this level are generally conservative, in the sense that they are virtually always beneficial and avoid changes which affect such things as floating point accuracy. In addition to the level 1 optimizations, do inner loop unrolling, if-conversion, two passes of instruction scheduling, global register allocation, dead store elimination, instruction scheduling across basic blocks, and partial redundancy elimination.
  
  3    Turn on aggressive optimization.
  The optimizations at this level are distinguished from -O2 by their aggressiveness, generally seeking highest-quality generated code even if it requires extensive compile time. They may include optimizations that are generally beneficial but may hurt performance.
  This includes but is not limited to turning on the Loop Nest Optimizer, -LNO:opt=1, and setting -OPT:roundoff=1:IEEE_arithmetic=2:Olimit=9000:reorg_common=ON.
  
  s    Specify that code size is to be given priority in tradeoffs with execution time.
  If no value is specified, 2 is assumed.
- -OPT:Ofast
- FOPTIMIZE
- -OPT:Ofast
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations. This effectively turns on the following optimizations: -OPT:ro=2:Olimit=0:div_split=ON:alias=typed.
- -OPT:malloc_alg=1
- FOPTIMIZE
- -OPT:malloc_alg=(0|1)
  Select an alternate malloc algorithm which may improve speed. The compiler adds setup code in the C/C++/Fortran "main" function to enable the chosen algorithm. The default is 0.
- -LANG:copyinout=off
- FOPTIMIZE
- -LANG:copyinout : When an array section is passed as the actual argument in a call, the compiler sometimes copies the array section to a temporary array and passes the temporary array, thus promoting locality in the accesses to the array argument. This optimization is relevant only to Fortran, and this flag controls the aggressiveness of this optimization. The default is ON for -O2 or higher and OFF otherwise.
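
The copy-in/copy-out idea behind -LANG:copyinout can be sketched in Python: a strided section is copied to a contiguous temporary, the callee works on the temporary, and the result is copied back. This is a conceptual model only; the helper names `call_with_copy_in_out` and `scale_in_place` are hypothetical, and the sketch says nothing about when the compiler chooses to make the copy.

```python
def scale_in_place(buffer, factor):
    """Stand-in callee: multiply every element of a contiguous buffer."""
    for i in range(len(buffer)):
        buffer[i] *= factor


def call_with_copy_in_out(array, start, stop, step, callee, *args):
    """Pass array[start:stop:step] to callee through a temporary copy."""
    temp = array[start:stop:step]    # copy-in: gather the strided section
    callee(temp, *args)              # callee sees a contiguous buffer
    array[start:stop:step] = temp    # copy-out: scatter results back


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    call_with_copy_in_out(data, 0, 6, 2, scale_in_place, 10)
    print(data)
```

The copy costs time and memory, which is the trade-off that setting the flag to off avoids.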

Benchmarks using both Fortran and C

- -march=opteron
- CC, FC, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -Ofast
- COPTIMIZE
- Equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math.
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations.
  
  NOTE: -Ofast enables -ipa (inter-procedural analysis), which places limitations on how libraries and .o files are built.
- Includes:
  - -O3
  - -ipa
  - -OPT:Ofast
  - -fno-math-errno
  - -ffast-math
- -OPT:malloc_alg=1
- COPTIMIZE, FOPTIMIZE
- -OPT:malloc_alg=(0|1)
  Select an alternate malloc algorithm which may improve speed. The compiler adds setup code in the C/C++/Fortran "main" function to enable the chosen algorithm. The default is 0.
- -O3
- FOPTIMIZE
- Specify the basic level of optimization desired.
  The options can be one of the following:
  
  0    Turn off all optimizations.
  
  1    Turn on local optimizations that can be done quickly. Do peephole optimizations and instruction scheduling.
  
  2    Turn on extensive optimization. This is the default.
  The optimizations at this level are generally conservative, in the sense that they are virtually always beneficial and avoid changes which affect such things as floating point accuracy. In addition to the level 1 optimizations, do inner loop unrolling, if-conversion, two passes of instruction scheduling, global register allocation, dead store elimination, instruction scheduling across basic blocks, and partial redundancy elimination.
  
  3    Turn on aggressive optimization.
  The optimizations at this level are distinguished from -O2 by their aggressiveness, generally seeking highest-quality generated code even if it requires extensive compile time. They may include optimizations that are generally beneficial but may hurt performance.
  This includes but is not limited to turning on the Loop Nest Optimizer, -LNO:opt=1, and setting -OPT:roundoff=1:IEEE_arithmetic=2:Olimit=9000:reorg_common=ON.
  
  s    Specify that code size is to be given priority in tradeoffs with execution time.
  If no value is specified, 2 is assumed.
- -OPT:Ofast
- FOPTIMIZE
- -OPT:Ofast
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations. This effectively turns on the following optimizations: -OPT:ro=2:Olimit=0:div_split=ON:alias=typed.
- -LANG:copyinout=off
- FOPTIMIZE
- -LANG:copyinout : When an array section is passed as the actual argument in a call, the compiler sometimes copies the array section to a temporary array and passes the temporary array, thus promoting locality in the accesses to the array argument. This optimization is relevant only to Fortran, and this flag controls the aggressiveness of this optimization. The default is ON for -O2 or higher and OFF otherwise.

Peak Optimization Flags

Fortran benchmarks

107.leslie3d

- -march=opteron
- FC, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -Ofast
- FOPTIMIZE
- Equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math.
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations.
  
  NOTE: -Ofast enables -ipa (inter-procedural analysis), which places limitations on how libraries and .o files are built.
- Includes:
  - -O3
  - -ipa
  - -OPT:Ofast
  - -fno-math-errno
  - -ffast-math
- -OPT:unroll_size=256
- FOPTIMIZE
- -OPT:unroll_size=N
  Set the ceiling on the maximum number of instructions in an unrolled inner loop. If N=0, the ceiling is disregarded. The default is 40.
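
To make the effect of the unroll_size ceiling concrete, the Python sketch below uses a deliberately simple model in which an unrolled loop body contains the original instruction count multiplied by the unroll factor. The compiler's real heuristic is not described here, so the function `unroll_factor`, its parameters, and the instruction counts are all assumptions made for illustration.

```python
def unroll_factor(body_instructions, requested_factor, unroll_size=40):
    """Largest factor whose unrolled body fits under the ceiling (toy model)."""
    if body_instructions <= 0 or requested_factor < 1:
        raise ValueError("instruction count and factor must be positive")
    if unroll_size == 0:
        return requested_factor  # N=0: the ceiling is disregarded
    fits = unroll_size // body_instructions
    # Never go below 1 (no unrolling) or above what was requested.
    return max(1, min(requested_factor, fits))


if __name__ == "__main__":
    # A hypothetical 12-instruction loop body with a requested factor of 8.
    print(unroll_factor(12, 8))        # default ceiling of 40
    print(unroll_factor(12, 8, 256))   # raised ceiling, as in this result
```

In this model, raising the ceiling from 40 to 256 lets larger loop bodies be unrolled more times before the limit applies.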

129.tera_tf

- -march=opteron
- FC, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -O3
- FOPTIMIZE
- Specify the basic level of optimization desired.
  The options can be one of the following:
  
  0    Turn off all optimizations.
  
  1    Turn on local optimizations that can be done quickly. Do peephole optimizations and instruction scheduling.
  
  2    Turn on extensive optimization. This is the default.
  The optimizations at this level are generally conservative, in the sense that they are virtually always beneficial and avoid changes which affect such things as floating point accuracy. In addition to the level 1 optimizations, do inner loop unrolling, if-conversion, two passes of instruction scheduling, global register allocation, dead store elimination, instruction scheduling across basic blocks, and partial redundancy elimination.
  
  3    Turn on aggressive optimization.
  The optimizations at this level are distinguished from -O2 by their aggressiveness, generally seeking highest-quality generated code even if it requires extensive compile time. They may include optimizations that are generally beneficial but may hurt performance.
  This includes but is not limited to turning on the Loop Nest Optimizer, -LNO:opt=1, and setting -OPT:roundoff=1:IEEE_arithmetic=2:Olimit=9000:reorg_common=ON.
  
  s    Specify that code size is to be given priority in tradeoffs with execution time.
  If no value is specified, 2 is assumed.
- -OPT:Ofast
- FOPTIMIZE
- -OPT:Ofast
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations. This effectively turns on the following optimizations: -OPT:ro=2:Olimit=0:div_split=ON:alias=typed.
- -OPT:malloc_alg=1
- FOPTIMIZE
- -OPT:malloc_alg=(0|1)
  Select an alternate malloc algorithm which may improve speed. The compiler adds setup code in the C/C++/Fortran "main" function to enable the chosen algorithm. The default is 0.
- -OPT:unroll_size=256
- FOPTIMIZE
- -OPT:unroll_size=N
  Set the ceiling on the maximum number of instructions in an unrolled inner loop. If N=0, the ceiling is disregarded. The default is 40.

Benchmarks using both Fortran and C

130.socorro

- -march=opteron
- CC, FC, LD
- The compiler optimizes code for the selected platform. The default value, auto, optimizes for the platform on which the compiler is running, as determined by reading /proc/cpuinfo. The value anyx86 selects a generic 32-bit x86 processor without SSE2 support.
- -Ofast
- COPTIMIZE
- Equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math.
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations.
  
  NOTE: -Ofast enables -ipa (inter-procedural analysis), which places limitations on how libraries and .o files are built.
- Includes:
  - -O3
  - -ipa
  - -OPT:Ofast
  - -fno-math-errno
  - -ffast-math
- -OPT:malloc_alg=1
- COPTIMIZE, FOPTIMIZE
- -OPT:malloc_alg=(0|1)
  Select an alternate malloc algorithm which may improve speed. The compiler adds setup code in the C/C++/Fortran "main" function to enable the chosen algorithm. The default is 0.
- -O3
- FOPTIMIZE
- Specify the basic level of optimization desired.
  The options can be one of the following:
  
  0    Turn off all optimizations.
  
  1    Turn on local optimizations that can be done quickly. Do peephole optimizations and instruction scheduling.
  
  2    Turn on extensive optimization. This is the default.
  The optimizations at this level are generally conservative, in the sense that they are virtually always beneficial and avoid changes which affect such things as floating point accuracy. In addition to the level 1 optimizations, do inner loop unrolling, if-conversion, two passes of instruction scheduling, global register allocation, dead store elimination, instruction scheduling across basic blocks, and partial redundancy elimination.
  
  3    Turn on aggressive optimization.
  The optimizations at this level are distinguished from -O2 by their aggressiveness, generally seeking highest-quality generated code even if it requires extensive compile time. They may include optimizations that are generally beneficial but may hurt performance.
  This includes but is not limited to turning on the Loop Nest Optimizer, -LNO:opt=1, and setting -OPT:roundoff=1:IEEE_arithmetic=2:Olimit=9000:reorg_common=ON.
  
  s    Specify that code size is to be given priority in tradeoffs with execution time.
  If no value is specified, 2 is assumed.
- -OPT:Ofast
- FOPTIMIZE
- -OPT:Ofast
  Use optimizations selected to maximize performance. Although the optimizations are generally safe, they may affect floating point accuracy due to rearrangement of computations. This effectively turns on the following optimizations: -OPT:ro=2:Olimit=0:div_split=ON:alias=typed.
- -LANG:copyinout=off
- FOPTIMIZE
- -LANG:copyinout : When an array section is passed as the actual argument in a call, the compiler sometimes copies the array section to a temporary array and passes the temporary array, thus promoting locality in the accesses to the array argument. This optimization is relevant only to Fortran, and this flag controls the aggressiveness of this optimization. The default is ON for -O2 or higher and OFF otherwise.
- -L/net/files/tools/acml/x86_64/acml3.5.0/pathscale64/lib -lacml
- EXTRA_LIBS
- -L -lacml,
  when used as an EXTRA_LIBS variable, links with the 64-bit AMD Core Math Library (ACML), compiled with the 64-bit PathScale Compiler for Linux. With "RM_SOURCES= specblas.F90 specbessel.c" set, the calls to LAPACK and BLAS functions in the rest of 130.socorro are resolved by optimized versions in ACML.

Other Flags

C benchmarks

- -IPA:max_jobs=4
- EXTRA_LDFLAGS
- -IPA:max_jobs=N : This option limits the maximum parallelism when invoking the compiler after IPA to (at most) N compilations running at once. The option can take the following values:
  
  0 = The parallelism chosen is equal to either the number of CPUs, the number of cores, or the number of hyperthreading units in the compiling system, whichever is greatest.
  
  1 = Disable parallelization during compilation (default)
  
  >1 = Specifically set the degree of parallelism
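
The three cases listed for -IPA:max_jobs=N can be summarized as a small decision function. The Python sketch below mirrors the description above; the function name `ipa_parallelism` and the hardware counts passed to it are hypothetical and stand in for whatever the compiler detects on the build host.

```python
def ipa_parallelism(max_jobs, cpus, cores, hw_threads):
    """Return how many post-IPA compilations may run at once."""
    if max_jobs < 0:
        raise ValueError("max_jobs must be zero or positive")
    if max_jobs == 0:
        # 0: use the greatest of the detected hardware counts
        return max(cpus, cores, hw_threads)
    # 1 disables parallel compilation (the default); >1 sets the degree
    return max_jobs


if __name__ == "__main__":
    # Made-up host: 2 CPUs, 8 cores, 16 hardware threads.
    for n in (0, 1, 4):
        print(n, ipa_parallelism(n, cpus=2, cores=8, hw_threads=16))
```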

C++ benchmarks

126.lammps

- -IPA:max_jobs=4
- EXTRA_LDFLAGS
- -IPA:max_jobs=N : This option limits the maximum parallelism when invoking the compiler after IPA to (at most) N compilations running at once. The option can take the following values:
  
  0 = The parallelism chosen is equal to either the number of CPUs, the number of cores, or the number of hyperthreading units in the compiling system, whichever is greatest.
  
  1 = Disable parallelization during compilation (default)
  
  >1 = Specifically set the degree of parallelism

Fortran benchmarks

- -IPA:max_jobs=4
- EXTRA_LDFLAGS
- -IPA:max_jobs=N : This option limits the maximum parallelism when invoking the compiler after IPA to (at most) N compilations running at once. The option can take the following values:
  
  0 = The parallelism chosen is equal to either the number of CPUs, the number of cores, or the number of hyperthreading units in the compiling system, whichever is greatest.
  
  1 = Disable parallelization during compilation (default)
  
  >1 = Specifically set the degree of parallelism

Benchmarks using both Fortran and C

- -IPA:max_jobs=4
- EXTRA_LDFLAGS
- -IPA:max_jobs=N : This option limits the maximum parallelism when invoking the compiler after IPA to (at most) N compilations running at once. The option can take the following values:
  
  0 = The parallelism chosen is equal to either the number of CPUs, the number of cores, or the number of hyperthreading units in the compiling system, whichever is greatest.
  
  1 = Disable parallelization during compilation (default)
  
  >1 = Specifically set the degree of parallelism

Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact webmaster@spec.org
Copyright 2006-2010 Standard Performance Evaluation Corporation
Tested with SPEC MPI2007 v60.
Report generated on Tue Jul 22 13:32:32 2014 by SPEC MPI2007 flags formatter v1445.


MPI2007 Flag Description
AMD, QLogic Corporation, Rackable Systems, IWILL AMD Emerald Cluster: AMD Opteron CPUs, QLogic InfiniPath/SilverStorm Interconnect

Test sponsored by QLogic Corporation

Compilers: QLogic PathScale Compiler Suite
