MPI2007 Result Flag Description

Base Optimization Flags

C benchmarks

- -O3
- mpicc,mpiCC,mpif90
- COPTIMIZE
- Enables O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On Intel Itanium processors, the O3 option enables optimizations for technical computing applications (loop-intensive code): loop optimizations and data prefetch. The O3 optimizations may not improve performance unless loop and memory access transformations take place, and in some cases they may slow code down compared to O2 optimizations.
  The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      - -unrolln
      - -builtin
      - -mno-ieee-fp
      - -fomit-frame-pointer
      - -ffunction-sections
- -ipo
- COPTIMIZE
- Multi-file interprocedural (IP) optimizations, including:
  - inline function expansion
  - interprocedural constant propagation
  - dead code elimination
  - propagation of function characteristics
  - passing arguments in registers
  - loop-invariant code motion
- -no-prec-div
- COPTIMIZE
- Enables optimizations that give slightly less precise results than full IEEE division. With some optimizations, such as -xN and -xB, the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation. The default is -prec-div, which provides fully precise IEEE division: it disables the division-to-multiplication optimization, giving greater accuracy with some loss of performance.
- -axS
- COPTIMIZE
- Instructs the compiler to generate SSE4 Vectorizing Compiler and Media Accelerators instructions for future Intel processors that support the instructions, as well as generic IA-32 architecture code.
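
The division-to-multiplication rewrite described under -no-prec-div can be illustrated by hand. The following is a minimal sketch, not compiler output: the function names scale_div, scale_recip, and max_rel_diff are hypothetical and chosen only for this illustration. It contrasts one true division per element with a single reciprocal followed by multiplications, and measures how far the two results drift apart.

```c
#include <stddef.h>

/* One IEEE division per element: the behaviour that -prec-div preserves. */
void scale_div(double *x, size_t n, double d) {
    for (size_t i = 0; i < n; i++) {
        x[i] = x[i] / d;
    }
}

/* Hand-written form of the rewrite: A/B computed as A * (1/B).
   The reciprocal is rounded once, then each product is rounded again,
   so results can differ from true division in the last bits. */
void scale_recip(double *x, size_t n, double d) {
    double r = 1.0 / d;
    for (size_t i = 0; i < n; i++) {
        x[i] = x[i] * r;
    }
}

/* Largest relative difference between two result vectors
   (elements where a[i] is zero are skipped). */
double max_rel_diff(const double *a, const double *b, size_t n) {
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double diff = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        double mag = a[i] < 0.0 ? -a[i] : a[i];
        if (mag > 0.0 && diff / mag > worst) {
            worst = diff / mag;
        }
    }
    return worst;
}
```

Running both functions over copies of the same data and passing the results to max_rel_diff shows the scale of the discrepancy: typically a few units in the last place, which is why the option is described as "slightly less precise".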

C++ benchmarks

126.lammps

- -O3
- mpicc,mpiCC,mpif90
- CXXOPTIMIZE
- Enables O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On Intel Itanium processors, the O3 option enables optimizations for technical computing applications (loop-intensive code): loop optimizations and data prefetch. The O3 optimizations may not improve performance unless loop and memory access transformations take place, and in some cases they may slow code down compared to O2 optimizations.
  The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      - -unrolln
      - -builtin
      - -mno-ieee-fp
      - -fomit-frame-pointer
      - -ffunction-sections
- -ipo
- CXXOPTIMIZE
- Multi-file interprocedural (IP) optimizations, including:
  - inline function expansion
  - interprocedural constant propagation
  - dead code elimination
  - propagation of function characteristics
  - passing arguments in registers
  - loop-invariant code motion
- -no-prec-div
- CXXOPTIMIZE
- Enables optimizations that give slightly less precise results than full IEEE division. With some optimizations, such as -xN and -xB, the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation. The default is -prec-div, which provides fully precise IEEE division: it disables the division-to-multiplication optimization, giving greater accuracy with some loss of performance.
- -axS
- CXXOPTIMIZE
- Instructs the compiler to generate SSE4 Vectorizing Compiler and Media Accelerators instructions for future Intel processors that support the instructions, as well as generic IA-32 architecture code.
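
Loop unrolling, listed among the -O3 transformations above, can be sketched by hand as follows. This is an illustration of the general technique only, written in C (the code is also valid C++), and it does not claim to show what the Intel compiler emits. The names sum_simple and sum_unrolled4 and the unroll factor of 4 are assumptions made for the example.

```c
#include <stddef.h>

/* Straightforward loop: one add and one loop test per element. */
long sum_simple(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Same loop unrolled by a factor of 4 (factor chosen for illustration).
   The main loop does four adds per loop test; a remainder loop handles
   the last n % 4 elements. */
long sum_unrolled4(const int *a, size_t n) {
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++) {
        s += a[i];
    }
    return s;
}
```

Integer data is used so that both versions return identical results; with floating-point data, an unrolled loop that also reorders the additions may round differently.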

Fortran benchmarks

- -O3
- mpicc,mpiCC,mpif90
- FOPTIMIZE
- Enables O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On Intel Itanium processors, the O3 option enables optimizations for technical computing applications (loop-intensive code): loop optimizations and data prefetch. The O3 optimizations may not improve performance unless loop and memory access transformations take place, and in some cases they may slow code down compared to O2 optimizations.
  The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      - -unrolln
      - -builtin
      - -mno-ieee-fp
      - -fomit-frame-pointer
      - -ffunction-sections
- -ipo
- FOPTIMIZE
- Multi-file interprocedural (IP) optimizations, including:
  - inline function expansion
  - interprocedural constant propagation
  - dead code elimination
  - propagation of function characteristics
  - passing arguments in registers
  - loop-invariant code motion
- -no-prec-div
- FOPTIMIZE
- Enables optimizations that give slightly less precise results than full IEEE division. With some optimizations, such as -xN and -xB, the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation. The default is -prec-div, which provides fully precise IEEE division: it disables the division-to-multiplication optimization, giving greater accuracy with some loss of performance.
- -axS
- FOPTIMIZE
- Instructs the compiler to generate SSE4 Vectorizing Compiler and Media Accelerators instructions for future Intel processors that support the instructions, as well as generic IA-32 architecture code.
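
The -O3 description above mentions padding power-of-two arrays for better cache use. The sketch below, in C, suggests why such padding can help. It is a simplified model under stated assumptions, not a description of the tested hardware: the cache geometry (64-byte lines, 64 sets), the 8-byte double, an array base aligned to a cache line, and the function name distinct_sets are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical cache geometry, assumed only for this illustration. */
enum { LINE_BYTES = 64, NUM_SETS = 64 };

/* Count how many distinct cache sets are touched when walking down
   column 0 of a row-major array of doubles that has `row_len` elements
   per row. Assumes sizeof(double) == 8 and a line-aligned base address. */
size_t distinct_sets(size_t row_len, size_t rows) {
    unsigned char seen[NUM_SETS] = {0};
    size_t count = 0;
    for (size_t r = 0; r < rows; r++) {
        size_t byte_off = r * row_len * sizeof(double);
        size_t set = (byte_off / LINE_BYTES) % NUM_SETS;
        if (!seen[set]) {
            seen[set] = 1;
            count++;
        }
    }
    return count;
}
```

In this model, a column walk over rows of 1024 doubles advances by exactly 128 cache lines per row, so every access lands in the same set and the rows compete for the same few cache ways. Padding each row to 1032 doubles makes the stride 129 lines, and consecutive rows then fall into consecutive sets.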

Benchmarks using both Fortran and C

- -O3
- mpicc,mpiCC,mpif90
- COPTIMIZE, FOPTIMIZE
- Enables O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On Intel Itanium processors, the O3 option enables optimizations for technical computing applications (loop-intensive code): loop optimizations and data prefetch. The O3 optimizations may not improve performance unless loop and memory access transformations take place, and in some cases they may slow code down compared to O2 optimizations.
  The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      - -unrolln
      - -builtin
      - -mno-ieee-fp
      - -fomit-frame-pointer
      - -ffunction-sections
- -ipo
- COPTIMIZE, FOPTIMIZE
- Multi-file interprocedural (IP) optimizations, including:
  - inline function expansion
  - interprocedural constant propagation
  - dead code elimination
  - propagation of function characteristics
  - passing arguments in registers
  - loop-invariant code motion
- -no-prec-div
- COPTIMIZE, FOPTIMIZE
- Enables optimizations that give slightly less precise results than full IEEE division. With some optimizations, such as -xN and -xB, the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation. The default is -prec-div, which provides fully precise IEEE division: it disables the division-to-multiplication optimization, giving greater accuracy with some loss of performance.
- -axS
- COPTIMIZE, FOPTIMIZE
- Instructs the compiler to generate SSE4 Vectorizing Compiler and Media Accelerators instructions for future Intel processors that support the instructions, as well as generic IA-32 architecture code.
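
Two of the -ipo optimizations listed above, inline function expansion and interprocedural constant propagation, can be pictured with a small before-and-after pair in C. This is a hand-written sketch of the effect, not actual compiler output. The function names are hypothetical, and both functions are shown in a single file for brevity, whereas -ipo applies the same reasoning across source files.

```c
/* Callee: in a multi-file build this could live in a different source
   file, out of reach of per-file optimization. */
static int scale(int x, int factor) {
    return x * factor;
}

/* Before: a real call, with `factor` passed at run time. */
int quadruple_call(int x) {
    return scale(x, 4);
}

/* After, written by hand: the call has been expanded inline and the
   constant argument 4 propagated into the body, leaving a single
   multiply by a known constant. */
int quadruple_inlined(int x) {
    return x * 4;
}
```

Both functions return the same value for every input; the second form simply has no call overhead and exposes the constant to later optimizations.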

Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact webmaster@spec.org
Copyright 2006-2010 Standard Performance Evaluation Corporation
Tested with SPEC MPI2007 v1.1.
Report generated on Tue Jul 22 13:34:59 2014 by SPEC MPI2007 flags formatter v1445.

MPI2007 Flag Description
Dell Inc. PowerEdge M605, Gigabit Ethernet, HP-MPI 2.2.7, Intel 10.1 compilers

Test sponsored by Platform Computing Inc.


