CPU2006 Flag Description
IBM Corporation IBM Power 570 (4.2 GHz, 32 core, RedHat)

Compilers: IBM XL C/C++ for Linux V10.1 and XL Fortran for Linux V12.1

Operating systems: SUSE Linux Enterprise 10 and Red Hat Enterprise Linux Advanced Platform 5


Base Compiler Invocation

C benchmarks

C++ benchmarks


Peak Compiler Invocation

C benchmarks

C++ benchmarks


Base Portability Flags

400.perlbench

462.libquantum

464.h264ref

483.xalancbmk


Peak Portability Flags

400.perlbench

403.gcc

462.libquantum

464.h264ref

483.xalancbmk


Base Optimization Flags

C benchmarks

C++ benchmarks

    • -O5
    • [user]
    • CXXOPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • CXXOPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • CXXOPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qrtti
    • [user]
    • CXXOPTIMIZE
    • Causes the C++ compiler to generate Run-Time Type Identification (RTTI) code for exception handling and for use by the typeid and dynamic_cast operators.

    • -lsmartheap
    • [user]
    • EXTRA_CXXLIBS
    • Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that optimizes calls to new, delete, malloc and free.


Peak Optimization Flags

C benchmarks

400.perlbench

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O4
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.

      -O4 is equivalent to the following flags

      • -O3
      • -qipa=level=1
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qalias=noansi
    • [user]
    • OPTIMIZE
    • -qalias=ansi | noansi
      If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object. The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.

      -qalias=std | nostd
      Indicates whether the compilation units contain any non-standard aliasing (see the Compiler Reference for more information). If so, specify nostd.
      
    • -lsmartheap
    • [user]
    • EXTRA_LIBS
    • Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that optimizes calls to new, delete, malloc and free.
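
The access pattern that type-based (ansi) aliasing rules out is one where the same bytes are read through an lvalue of an unrelated type. The Python sketch below is only an analogy and does not model XL C behavior: it uses the standard struct module to view the bytes of a 32-bit float as an unsigned integer, which is the reinterpretation a C pointer cast between unrelated types would attempt. The helper name float_bits is hypothetical.

```python
import struct


def float_bits(value):
    """Reinterpret the 4 bytes of an IEEE-754 single as an unsigned 32-bit int."""
    raw = struct.pack("<f", value)       # 4 bytes, little-endian float
    return struct.unpack("<I", raw)[0]   # the same bytes viewed as uint32


if __name__ == "__main__":
    print(hex(float_bits(1.0)))  # 0x3f800000
```

C source that depends on this kind of reinterpretation through pointer casts is a typical candidate for -qalias=noansi, since the compiler can then no longer assume that differently typed lvalues refer to different objects.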

401.bzip2

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O3
    • [user]
    • OPTIMIZE
    • Performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics of the program slightly, unless -qstrict is specified. We recommend these optimizations when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources. The optimizations provided include:

      • In-depth memory access analysis
      • Better loop scheduling
      • High-order loop analysis and transformations (-qhot=level=0)
      • Inlining of small procedures within a compilation unit by default
      • Eliminating implicit compile-time memory usage limits
      • Widening, which merges adjacent load/stores and other operations
      • Pointer aliasing improvements to enhance other optimizations

      -O3 is equivalent to the following flags

      • -O2
      • -qhot=level=0

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.
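
The "is equivalent to the following flags" lists in this document nest: -O5 refers to -O4, which refers to -O3, which refers to -O2. As a rough illustration, the following Python sketch expands a flag using only those listed equivalences. The table and the function name expand are hypothetical helpers, and the sketch does not model how the compiler resolves repeated or conflicting suboptions.

```python
# Equivalences copied from the flag descriptions in this document.
EQUIVALENT = {
    "-O5": ["-O4", "-qipa=level=2", "-qarch=auto", "-qtune=auto"],
    "-O4": ["-O3", "-qipa=level=1", "-qarch=auto", "-qtune=auto"],
    "-O3": ["-O2", "-qhot=level=0"],
}


def expand(flag):
    """Recursively expand a flag, keeping the first occurrence of duplicates."""
    result = []
    for item in EQUIVALENT.get(flag, [flag]):
        # Expand nested equivalences; leave unknown flags as they are.
        parts = expand(item) if item in EQUIVALENT else [item]
        for part in parts:
            if part not in result:
                result.append(part)
    return result


if __name__ == "__main__":
    print(expand("-O5"))
```

Expanding -O5 this way lists both -qipa=level=1 (inherited from -O4) and -qipa=level=2. The description of -O5 states that it provides the functionality of -qipa=level=2.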

403.gcc

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O4
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.

      -O4 is equivalent to the following flags

      • -O3
      • -qipa=level=1
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qalloca
    • [user]
    • OPTIMIZE
    • Directs the compiler to provide an inline definition of alloca() when it is called from source code that does not include the alloca.h header.

    • -q64
    • [user]
    • COPTIMIZE
    • Generates 64-bit ABI binaries. The default is to generate 32-bit binaries.

    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.
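
The practical difference between the 32-bit default and -q64 is pointer width. The arithmetic below is a minimal sketch of the theoretical address range for each width; the function name addressable_bytes is hypothetical, and the address space actually usable by a process is smaller than these upper bounds.

```python
GIB = 2 ** 30


def addressable_bytes(pointer_bits):
    """Upper bound on the bytes addressable with pointers of the given width."""
    return 2 ** pointer_bits


if __name__ == "__main__":
    print(addressable_bytes(32) // GIB)  # 4
    print(addressable_bytes(64) // GIB)  # 17179869184
```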

429.mcf

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -O5
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qnoenablevmx
    • [user]
    • OPTIMIZE
    • Disables generation of vector instructions for processors that support them.

    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.
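
To put the 16-megabyte page size in perspective, the following sketch counts how many pages are needed to back a heap at different page sizes. The 1 GiB heap and the 4 KiB and 64 KiB comparison sizes are illustrative assumptions that are not stated in this document, and pages_needed is a hypothetical helper.

```python
KIB = 2 ** 10
MIB = 2 ** 20


def pages_needed(heap_bytes, page_bytes):
    """Number of pages required to back heap_bytes, rounded up."""
    return -(-heap_bytes // page_bytes)


if __name__ == "__main__":
    heap = 1024 * MIB  # hypothetical 1 GiB heap
    for page in (4 * KIB, 64 * KIB, 16 * MIB):
        print(page, pages_needed(heap, page))
```

Fewer, larger pages mean fewer address translations have to be tracked for the same heap, which is the usual motivation for backing a large heap with huge pages.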

445.gobmk

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O4
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.

      -O4 is equivalent to the following flags

      • -O3
      • -qipa=level=1
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qnoenablevmx
    • [user]
    • OPTIMIZE
    • Disables generation of vector instructions for processors that support them.

    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.

456.hmmer

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O5
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.
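
The -qpdf1 and -qpdf2 entries describe a two-pass build: an instrumented build, a run that collects profile data, and a rebuild that uses the data. The sketch below lays out that sequence using the PASS1 and PASS2 variable names listed in this document. How the benchmark tools actually combine these variables with OPTIMIZE is an assumption here, and flags_for_pass and build_plan are hypothetical helpers.

```python
# Per-pass variables and flags as listed in this document for PDF builds.
PASS_FLAGS = {
    1: {"PASS1_CFLAGS": "-qpdf1", "PASS1_LDFLAGS": "-qpdf1"},
    2: {"PASS2_CFLAGS": "-qpdf2", "PASS2_LDFLAGS": "-qpdf2"},
}


def flags_for_pass(pass_number, optimize):
    """Return (compile_flags, link_flags) for one pass.

    Assumption: the OPTIMIZE flags are applied unchanged in both passes
    and the per-pass flag is appended after them.
    """
    per_pass = PASS_FLAGS[pass_number]
    compile_flags = list(optimize) + [per_pass["PASS%d_CFLAGS" % pass_number]]
    link_flags = list(optimize) + [per_pass["PASS%d_LDFLAGS" % pass_number]]
    return compile_flags, link_flags


def build_plan(optimize):
    """List the steps of a two-pass PDF build in order."""
    return [
        ("build with -qpdf1 (instrumented)", flags_for_pass(1, optimize)),
        ("run on training input to collect profile data", None),
        ("rebuild with -qpdf2 (uses profile data)", flags_for_pass(2, optimize)),
    ]


if __name__ == "__main__":
    for step, flags in build_plan(["-O5", "-qarch=pwr6", "-qtune=pwr6"]):
        print(step, flags)
```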

458.sjeng

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -O5
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.

462.libquantum

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O5
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qnoenablevmx
    • [user]
    • OPTIMIZE
    • Disables generation of vector instructions for processors that support them.

    • -q64
    • [user]
    • COPTIMIZE
    • Generates 64-bit ABI binaries. The default is to generate 32-bit binaries.

    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.

464.h264ref

    • -Wl,-q
    • [user]
    • LDCFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O5
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -q64
    • [user]
    • COPTIMIZE
    • Generates 64-bit ABI binaries. The default is to generate 32-bit binaries.

    • -lhugetlbfs
    • [user]
    • EXTRA_LIBS
    • Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.

C++ benchmarks

471.omnetpp

    • -qpdf1
    • [user]
    • PASS1_CXXFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CXXFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O4
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.

      -O4 is equivalent to the following flags

      • -O3
      • -qipa=level=1
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qrtti
    • [user]
    • OPTIMIZE
    • Causes the C++ compiler to generate Run-Time Type Identification (RTTI) code for exception handling and for use by the typeid and dynamic_cast operators.

    • -lsmartheap
    • [user]
    • EXTRA_LIBS
    • Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that optimizes calls to new, delete, malloc and free.

473.astar

    • -Wl,-q
    • [user]
    • LDCXXFLAGS
    • Pass the -q flag to the linker, causing the final executable to retain relocation information.

    • -qpdf1
    • [user]
    • PASS1_CXXFLAGS, PASS1_LDFLAGS
    • The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor gather any data other than path and data values for PDF-specific optimizations.

    • -qpdf2
    • [user]
    • PASS2_CXXFLAGS, PASS2_LDFLAGS
    • The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.

    • -O4
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.

      -O4 is equivalent to the following flags

      • -O3
      • -qipa=level=1
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compilation is performed. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qnoenablevmx
    • [user]
    • OPTIMIZE
    • Disables generation of vector instructions for processors that support them.

    • -lsmartheap
    • [user]
    • EXTRA_LIBS
    • Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that optimizes calls to new, delete, malloc and free.
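
The -qpdf1 and -qpdf2 entries for 473.astar describe a two-pass build with a training run in between. The following is a minimal dry-run sketch of that sequence: it only prints the commands. The xlC driver name, the file names, and the training-run placeholder are assumptions for illustration, and the other 473.astar flags are omitted for brevity.

```shell
# Dry-run sketch of a two-pass profile directed feedback (PDF) build.
# Assumptions: the xlC driver name, source name, and binary name are
# placeholders; nothing is compiled or executed here.
pdf_build_plan() {
    src="$1"    # hypothetical source file
    exe="$2"    # hypothetical output binary
    # Pass 1: build an instrumented binary that records PDF information.
    echo "xlC -O4 -qpdf1 -o $exe $src"
    # Training run: execute the instrumented binary to collect the profile.
    echo "./$exe <training input>"
    # Pass 2: rebuild so the optimizer can use the recorded profile.
    echo "xlC -O4 -qpdf2 -o $exe $src"
}

pdf_build_plan astar.cpp astar
```

In a real SPEC run the tools drive both passes through the PASS1_* and PASS2_* variables, so a sequence like this would not be typed by hand.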

483.xalancbmk

    • -Wl,-q
    • [user]
    • LDCXXFLAGS
    • Pass the -q flag to the linker causing the final executable to have the relocation information.

    • -O5
    • [user]
    • OPTIMIZE
    • Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.

      -O5 is equivalent to the following flags

      • -O4
      • -qipa=level=2
      • -qarch=auto
      • -qtune=auto

    • Includes:
    • -qarch=pwr6
    • [user]
    • OPTIMIZE
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compile is being done. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=pwr6
    • [user]
    • OPTIMIZE
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -lsmartheap
    • [user]
    • EXTRA_LIBS
    • Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that optimizes calls to new, delete, malloc and free.


Base Other Flags

C benchmarks

    • -qipa=noobject
    • [user]
    • EXTRA_CFLAGS
    • Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option does not affect the code in the final binary created.

    • -qipa=threads
    • [user]
    • EXTRA_LDFLAGS
    • The threads suboption allows the IPA optimizer to run portions of the optimization process in parallel threads, which can speed up the compilation process on multi-processor systems. All the available threads, or the number specified by N, may be used. N must be a positive integer. Specifying nothreads does not run any parallel threads; this is equivalent to running one serial thread. This option does not affect the code in the final binary created.

C++ benchmarks

    • -qipa=noobject
    • [user]
    • EXTRA_CXXFLAGS
    • Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option does not affect the code in the final binary created.

    • -qipa=threads
    • [user]
    • EXTRA_LDFLAGS
    • The threads suboption allows the IPA optimizer to run portions of the optimization process in parallel threads, which can speed up the compilation process on multi-processor systems. All the available threads, or the number specified by N, may be used. N must be a positive integer. Specifying nothreads does not run any parallel threads; this is equivalent to running one serial thread. This option does not affect the code in the final binary created.


Peak Other Flags

C benchmarks

    • -qipa=noobject
    • [user]
    • EXTRA_CFLAGS
    • Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option does not affect the code in the final binary created.

    • -qipa=threads
    • [user]
    • EXTRA_LDFLAGS
    • The threads suboption allows the IPA optimizer to run portions of the optimization process in parallel threads, which can speed up the compilation process on multi-processor systems. All the available threads, or the number specified by N, may be used. N must be a positive integer. Specifying nothreads does not run any parallel threads; this is equivalent to running one serial thread. This option does not affect the code in the final binary created.

C++ benchmarks

    • -qipa=noobject
    • [user]
    • EXTRA_CXXFLAGS
    • Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option does not affect the code in the final binary created.

    • -qipa=threads
    • [user]
    • EXTRA_LDFLAGS
    • The threads suboption allows the IPA optimizer to run portions of the optimization process in parallel threads, which can speed up the compilation process on multi-processor systems. All the available threads, or the number specified by N, may be used. N must be a positive integer. Specifying nothreads does not run any parallel threads; this is equivalent to running one serial thread. This option does not affect the code in the final binary created.


Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

    • -O3
    • [user]
    • Performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics of the program slightly, unless -qstrict is specified. We recommend these optimizations when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources. The optimizations provided include:

      • In-depth memory access analysis
      • Better loop scheduling
      • High-order loop analysis and transformations (-qhot=level=0)
      • Inlining of small procedures within a compilation unit by default
      • Eliminating implicit compile-time memory usage limits
      • Widening, which merges adjacent load/stores and other operations
      • Pointer aliasing improvements to enhance other optimizations

      -O3 is equivalent to the following flags

      • -O2
      • -qhot=level=0

    • Includes:
    • -O2
    • [user]
    • Performs a set of optimizations that are intended to offer improved performance without an unreasonable increase in time or storage that is required for compilation including:

      • Eliminates redundant code
      • Basic loop optimization
      • Can structure code to take advantage of -qarch and -qtune settings

    • Includes:
    • -O
    • [user]
    • Enables the level of optimization that represents the best tradeoff between compilation speed and run-time performance. If you need a specific level of optimization, specify the appropriate numeric value. Currently, -O is equivalent to -O2.

    • Includes:
    • -qhot=level=0
    • [user]
    • Performs high-order transformations on loops during optimization.
          o arraypad
            The compiler will pad any arrays where it infers that there may be a benefit. 
          o level=0
            The compiler performs a limited set of high-order loop transformations. 
          o level=1
            The compiler performs its full set of high-order loop transformations. 
          o simd
            Replaces certain instruction sequences with vector instructions. 
          o vector
            Replaces certain instruction sequences with calls to the MASS library. 
      
      Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and -qhot=level=1. The -qhot option is also implied by -O4, and -O5. 
      
    • -qipa=level=1
    • [user]
    • Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA). The level determines the amount of interprocedural analysis and optimization that is performed.

      level=0 does only minimal interprocedural analysis and optimization

      level=1 turns on inlining, limited alias analysis, and limited call-site tailoring

      level=2 turns on full interprocedural data flow and alias analysis

    • -qarch=auto
    • [user]
    • Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compile is being done. "pwr5x" is the POWER5+ processor.

      Supported values for this flag are

      • auto
      • Use the processor on which the program is compiled.
      • pwr6e
      • The POWER6 processor in "Enhanced" mode based systems.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qtune=auto
    • [user]
    • Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:

      • auto
      • Use the processor on which the program is compiled.
      • pwr6
      • The POWER6 processor based systems.
      • pwr5x
      • The POWER5+ processor based systems.
      • pwr5
      • The POWER5 processor based systems.
      • pwr4
      • The POWER4 processor based systems.
      • ppc970
      • The PPC970 processor based systems.
    • -qipa=level=2
    • [user]
    • Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA). The level determines the amount of interprocedural analysis and optimization that is performed.

      level=0 does only minimal interprocedural analysis and optimization

      level=1 turns on inlining, limited alias analysis, and limited call-site tailoring

      level=2 turns on full interprocedural data flow and alias analysis
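
The "is equivalent to" lists in this document form a chain from -O5 down to -O2. As a reading aid, the sketch below encodes one level of that chain as a lookup. The function name implied_flags is hypothetical, and the table only restates the lists given in this document; it makes no claim about how the compiler resolves repeated suboptions.

```shell
# One-level expansion of an XL optimization level into the flags this
# document lists as equivalent. implied_flags is a hypothetical helper.
implied_flags() {
    case "$1" in
        -O5) echo "-O4 -qipa=level=2 -qarch=auto -qtune=auto" ;;
        -O4) echo "-O3 -qipa=level=1 -qarch=auto -qtune=auto" ;;
        -O3) echo "-O2 -qhot=level=0" ;;
        -O)  echo "-O2" ;;
        *)   echo "$1" ;;   # any other flag stands for itself
    esac
}

for flag in -O5 -O4 -O3; do
    echo "$flag => $(implied_flags "$flag")"
done
```

Note that the benchmarks above also pass -qarch=pwr6 and -qtune=pwr6 explicitly alongside these levels.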


System and Other Tuning Information

  • ulimit -s unlimited

  • Sets the stack size to "unlimited" to allow the stack size to grow without limit.
  • To reserve 200 huge pages out of the physical memory pool, issue the following command
  • echo 200 > /proc/sys/vm/nr_hugepages
    
  • chsyscfg -m system -r prof -i name=profile,lpar_name=partition,lpar_proc_compat_mode=POWER6_enhanced
    This command enables the optional PowerPC architecture instructions supported on POWER6.
    Usage: chsyscfg -r lpar | prof | sys | sysprof | frame
                    -m <managed system> | -e <managed frame>
                    -f <configuration file> | -i "<configuration data>"
                    [--help]
    
    Changes partitions, partition profiles, system profiles, or the attributes of a
    managed system or a managed frame.
    
        -r                        - the type of resource(s) to be changed:
                                      lpar    - partition
                                      prof    - partition profile
                                      sys     - managed system
                                      sysprof - system profile
                                      frame   - managed frame
        -m <managed system>       - the managed system's name
        -e <managed frame>        - the managed frame's name
        -f <configuration file>   - the name of the file containing the
                                    configuration data for this command.
                                    The format is:
                                      attr_name1=value,attr_name2=value,...
                                    or
                                      "attr_name1=value1,value2,...",...
        -i "<configuration data>" - the configuration data for this command.
                                    The format is:
                                      "attr_name1=value,attr_name2=value,..."
                                    or
                                      ""attr_name1=value1,value2,...",..."
        --help                    - prints this help
    
    The valid attribute names for this command are:
        -r prof     required: name, lpar_id | lpar_name
                    optional: ...
                              lpar_proc_compat_mode (default | POWER6_enhanced)
    
  • Each process was bound to a CPU by using the numactl command in the submit= setting:
  • submit = numactl --membind=\$SPECCOPYNUM --physcpubind=\$SPECCOPYNUM $command
    
  • numactl : Control NUMA policy for processes or shared memory
         --membind=nodes
           Only  allocate  memory  from  nodes.   Allocation will fail when
           there is not enough memory available on these nodes.
    
         --physcpubind=cpus
           Only execute process on cpus.  This accepts physical cpu numbers
           as shown in the processor fields of /proc/cpuinfo.
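
To make the submit= line concrete, the sketch below prints the command line it would expand to for a given copy number. The function name bound_command and the binary name are hypothetical, and nothing is executed. It simply mirrors the submit line, in which the copy number is used as both the memory node and the physical CPU number.

```shell
# Print the numactl command line for one benchmark copy.
# bound_command is a hypothetical helper; it echoes instead of executing.
bound_command() {
    copynum="$1"    # stands in for $SPECCOPYNUM
    shift
    echo "numactl --membind=$copynum --physcpubind=$copynum $*"
}

bound_command 0 ./benchmark_binary
bound_command 3 ./benchmark_binary
```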
    
  • Environment variables that can be set before the run:
  • HUGETLB_VERBOSE=0 : Turns off any debugging messages from libhugetlbfs.
    HUGETLB_MORECORE=yes : Instructs libhugetlbfs to override libc's normal morecore() function with a hugepage version and use it for malloc().
    HUGETLB_MORECORE_HEAPBASE=0x50000000 : Specifies that the hugepage heap starts at address 0x50000000.
    XLFRTEOPTS=intrinthrds=1 : Causes the Fortran runtime to use only a single thread.
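
One way to set these variables is to export them in the shell that launches the run. This is only a sketch: the document does not say whether they were set in a shell or in the SPEC config file, so exporting them like this is an assumption.

```shell
# Export the variables listed above before starting the run.
# Assumption: they are set in the launching shell rather than in a
# SPEC config file.
export HUGETLB_VERBOSE=0
export HUGETLB_MORECORE=yes
export HUGETLB_MORECORE_HEAPBASE=0x50000000
export XLFRTEOPTS=intrinthrds=1

# Show what the child processes will inherit.
env | grep -E '^(HUGETLB_|XLFRTEOPTS=)' | sort
```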
    
  • Post-Link Optimization (fdprpro):
  • 
          - First, the original executable (baseexe) is copied to baseexe.orig.
    
          - Then, the executable is instrumented and its initial profile generated, as follows: 
            $ fdprpro -a instr baseexe 
            The output will be generated (by default) in baseexe.instr and its profile in baseexe.nprof. 
    
          - Next, run baseexe.instr using the training data. This will fill the profile file with information that characterizes the training workload.
    
          - Finally, re-run FDPR-Pro with the profile file provided, as follows: 
            $ fdprpro -a opt -f baseexe.nprof [optimization options] baseexe 
    
          - The following optimization options were used: -q -O4 -A 32 -shci 90 -sdp 9
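
The steps above can be collected into one sequence. The sketch below is a dry run that only prints the commands, using the file names and options stated in the steps. The function name fdpr_plan and the training-run placeholder are assumptions for illustration.

```shell
# Dry-run sketch of the FDPR-Pro post-link optimization steps.
# fdpr_plan is a hypothetical helper; it prints commands, runs nothing.
fdpr_plan() {
    exe="$1"
    # Keep a copy of the original executable.
    echo "cp $exe $exe.orig"
    # Instrument the executable; produces $exe.instr and $exe.nprof.
    echo "fdprpro -a instr $exe"
    # Run the instrumented binary on the training data to fill the profile.
    echo "./$exe.instr <training input>"
    # Optimize using the collected profile and the options listed above.
    echo "fdprpro -a opt -f $exe.nprof -q -O4 -A 32 -shci 90 -sdp 9 $exe"
}

fdpr_plan baseexe
```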
    
          Optimization Options Descriptions:
    
           -A alignment, --align-code alignment
                Align program code so that hot code will be aligned on alignment-byte addresses.
    
           -abb factor, --align-basic-blocks factor
                Align basic blocks that are hotter than the average by the given (float) factor. This is a lower-level
                machine-specific alignment compared to --align-code. A value of -1 (the default) disables this option.
    
           -bf, --branch-folding
                Eliminate branch to branch instructions.
    
           -bp, --branch-prediction
                Set branch prediction bit for conditional branches.
    
           -dce, --dead-code-elimination
                Eliminate instructions related to unused local variables within frequently executed functions (useful
                mainly after applying function inlining optimization).
    
           -dp, --data-prefetch
                Insert dcbt instructions to improve data-cache performance.
    
           -ece, --epilog-code-eliminate
                Reduce code size by grouping common instructions in functions' epilogs, into a single unified code.
    
           -hr, --hco-reschedule
                Relocate instructions from frequently executed code to rarely executed code areas, when possible.
    
           -hrf factor, --hco-resched-factor factor
                Set the aggressiveness of the -hr optimization option according to a factor value between (0,1), where
                0 is the least aggressive factor (applicable only with the -hr option).
    
           -i, --inline
                Same as --selective-inline with --inline-small-funcs 12.
    
           -ihf pct, --inline-hot-functions pct
                Inline all function call sites to functions that have a frequency count greater than the given pct
                frequency percentage.
    
           -isf size, --inline-small-funcs size
                Inline all functions that are smaller or equal to the given size in bytes.
    
           -kr, --killed-registers
                Eliminate stores and restores of registers that are killed (overwritten) after frequently executed
                function calls.
    
           -lap, --load-address-propagation
                Eliminate load instructions of variables' addresses by re-using pre-loaded addresses of adjacent
                variables.
    
           -las, --load-after-store
                Add NOP instructions to place each load instruction further apart following a store instruction that
                references the same memory address.
    
           -lro, --link-register-optimization
                Eliminate saves and restores of the link register in frequently-executed functions.
    
           -lu aggressiveness_factor, --loop-unroll aggressiveness_factor
                Unroll short loops consisting of one to several basic blocks according to an aggressiveness factor
                between (1,9), where 1 is the least aggressive unrolling option for very hot and short loops.
    
           -lun unrolling_number, --loop-unrolling-number unrolling_number
                Set the number of unrolled iterations in each unrolled loop. The allowed range is between (2,50).
                Default is set to 2. (applicable only with the -lu flag).
    
           -nop, --nop-removal
                Remove NOP instructions from reordered code.
    
           -O   Switch on basic optimizations only. Same as -RC -nop -bp -bf.
    
           -O2  Switch on less aggressive optimization flags. Same as -O -hr -pto -isf 8 -tlo -kr.
    
           -O3  Switch on aggressive optimization flags. Same as -O2 -RD -isf 12 -si -dp -lro -las -vro -btcar -lu 9
                -rt 0 -pbsi.
    
           -O4  Switch on aggressive optimization flags together with aggressive function inlining. Same as -O3 -sidf
                50 -ihf 20 -sdp 9 -shci 90 and -bldcg (for XCOFF files).
    
           -O5  Switch on aggressive optimization flags together with HLR optimization. Same as -O4 -sa -gcpyp -gcnstp
                -dce.
    
           -pbsi, --path-based-selective-inline
                Perform selective inlining of dominant hot function calls based on control flow paths leading to hot
                functions.
    
           -pca, --propagate-constant-area
                Relocate the constant variables area to the top of the code section when possible.
    
           -[no]pr, --[no]ptrgl-r11
                Perform removal of R11 load instruction in _ptrgl csect.
    
           -pto, --ptrgl-optimization
                Perform optimization of indirect call instructions via registers by replacing them with conditional
                direct jumps.
    
           -ptosl limit_size, --ptrgl-optimization-size-limit limit_size
                Set the limit of the number of conditional statements generated by -pto optimization. Allowed values
                are between 1..100. Default value set to 3. (applicable only with the -pto flag).
    
           -ptoht heatness_threshold, --ptrgl-optimization-heatness-threshold heatness_threshold
                Set the frequency threshold for indirect calls that are to be optimized by -pto optimization. Allowed
                range between 0..1. Default is set to 0.8. (applicable only with -pto flag).
    
           -RC, --reorder-code
                Perform code reordering.
    
           -rcaf aggressiveness_factor, --reorder-code-aggressivenes-factor aggressiveness_factor
                Set the aggressiveness of code reordering optimization. Allowed values are [0 | 1 | 2], where 0
                preserves original code order and 2 is the most aggressive. Default is set to 1. (applicable only with
                the -RC flag).
    
           -rcctf termination_factor, --reorder-code-chain-termination-factor termination_factor
                Set the threshold fraction which determines when to terminate each chain of basic blocks during code
                reordering. Allowed input range is between 0.0 to 1.0 where 0.0 generates long chains and 1.0 creates
                single basic block chains. Default is set to 0.05. (applicable only with the -RC flag).
    
           -rccrf reversal_factor, --reorder-code-condition-reversal-factor reversal_factor
                Set the threshold fraction which determines when to enable condition reversal for each conditional
                branch during code reordering. Allowed input range is between 0.0 to 1.0 where 0.0 tries to preserve
                original condition direction and 1.0 ignores it. Default is set to 0.8 (applicable only with the -RC
                flag).
    
           -RD, --reorder-data
                Perform static data reordering.
    
           -rmte, --remove-multiple-toc-entries
                Remove multiple TOC entries pointing to the same location in the input program file.
    
           -rt removal_factor, --reduce-toc removal_factor
                Perform removal of TOC entries according to a removal factor between (0,1), where 0 removes
                non-accessed TOC entries only, and 1 removes all possible TOC entries.
    
           -sdp aggressiveness_factor, --stride-data-prefetch aggressiveness_factor
                Perform data prefetching within frequently executed loops based on stride analysis, according to an
                aggressiveness factor between (1,9), where 1 is least aggressive.
    
           -sdpla iterations_number, --stride-data-prefetch-look-ahead iterations_number
                Set the number of iterations for which data is prefetched into the cache ahead of time. Default value
                is set to 4 iterations. (applicable only with the -sdp flag).
    
           -sdpms stride_min_size, --stride-data-prefetch-min-size stride_min_size
                Set the minimal stride size in bytes, for which data will be considered as a candidate for
                prefetching. Default value is set to 128 bytes. (applicable only with the -sdp flag).
    
           -shci pct, --selective-hot-code-inline pct
                Perform selective inlining of functions in order to decrease the total number of execution counts, so
                that only functions whose hotness is above the given percentage are inlined.
    
           -si, --selective-inline
                Perform selective inlining of dominant hot function calls.
    
           -sll Lib1:Prof1,...,LibN:ProfN, --static-link-libraries Lib1:Prof1,...,LibN:ProfN
                Statically link hot code from specified dynamically linked libraries to the input program. The
                parameter consists of a comma-separated list of libraries and their profiles. IMPORTANT: licensing rights of
                specified libraries should be observed when applying this copying optimization.
    
           -sllht hotness_threshold, --static-link-libraries-hotness-threshold hotness_threshold
                Set hotness threshold for the --static-link-libraries optimization. The allowed input range is between
                0 (least aggressive) to 1, or -1, which does not require profile and selects all code that might be
                called by the input program from the given libraries. Default is 0.5.
    
           -sidf percentage_factor, --selective-inline-dominant-factor percentage_factor
                Set a dominant factor percentage for selective inline optimization. The allowed range is between
                (0,100). Default is set to 80 (applicable only with the -si and -pbsi flags).
    
           -siht frequency_factor, --selective-inline-hotness-threshold frequency_factor
                Set a hotness threshold factor percentage for selective inline optimization to inline all dominant
                function calls that have a frequency count greater than the given frequency percentage. Default is set
                to 100 (applicable only with the -si and -pbsi flags).
    
           -so, --stack-optimization
                Reduce the stack frame size of functions which are called with a small number of arguments.
    
           -tb, --preserve-traceback-tables
                Force the restructuring of traceback tables in reordered code. If -tb option is omitted, traceback
                tables are automatically included only for C++ applications which use the Try & Catch mechanism.
    
           -rtb, --remove-traceback-tables
                Remove traceback tables in reordered code.
    
           -tlo, --tocload-optimization
                Replace each load instruction that references the TOC with a corresponding add-immediate instruction
                via the TOC anchor register, when possible.
    
           -vro, --volatile-registers-optimization
                Eliminate stores and restores of non-volatile registers in frequently executed functions by using
                available volatile registers.
    
    

Flag description origin markings:

[user] Indicates that the flag description came from the user flags file.
[suite] Indicates that the flag description came from the suite-wide flags file.
[benchmark] Indicates that the flag description came from a per-benchmark flags file.

The flags file that was used to format this result can be browsed at
http://www.spec.org/cpu2006/flags/IBM-Linux-XL.20090713.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/cpu2006/flags/IBM-Linux-XL.20090713.xml.


For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact webmaster@spec.org
Copyright 2006-2014 Standard Performance Evaluation Corporation
Tested with SPEC CPU2006 v1.1.
Report generated on Tue Jul 22 20:31:32 2014 by SPEC CPU2006 flags formatter v6906.