CPU2017 Result Flag Description

Base Optimization Flags

C benchmarks

- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qipa=noobject
- EXTRA_CFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qalias=noansi
- EXTRA_CFLAGS
- -qalias=ansi | noansi :
  If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object. If noansi is specified, the optimizer makes worst-case aliasing assumptions. The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.
  -qalias=std | nostd :
  Indicates whether the compilation units contain any non-standard aliasing. If so, specify nostd.
- -O5
- EXTRA_COPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- EXTRA_COPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- EXTRA_COPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
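
The -qalias entry above distinguishes type-based (ansi) aliasing from the worst-case assumptions of noansi. The following is a minimal C sketch of the kind of code where that choice matters. It is not taken from any benchmark, and the names `store_then_reload` and `demo_alias` are hypothetical.

```c
#include <assert.h>

/* Stores through an int pointer and a float pointer, then rereads the int.
 * With type-based (ansi) aliasing the optimizer may assume *fp and *ip are
 * different objects and return the constant 1 without reloading *ip.
 * With noansi it has to assume the float store might have changed *ip. */
int store_then_reload(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;
    return *ip;
}

/* Well-defined use: the two pointers refer to distinct objects. */
void demo_alias(void)
{
    int i = 0;
    float f = 0.0f;
    assert(store_then_reload(&i, &f) == 1);
    assert(f == 2.0f);
}
```

Code that passes the same storage through both pointers (for example by casting) breaks the ansi assumption, which is the usual reason for building with noansi.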

C++ benchmarks

- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qipa=noobject
- EXTRA_CXXFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -O4
- EXTRA_CXXOPTIMIZE
- Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.
  -O4 is equivalent to the following flags:
  - -O3
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
  - -qsimd=auto
- Includes:
  - -O3
    - -O2
      - -O
    - -qhot=level=0
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
  - -qsimd=auto
- -qarch=pwr9
- EXTRA_CXXOPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- EXTRA_CXXOPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -ltcmalloc
- EXTRA_CXXOPTIMIZE
- Link with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc, and free.

Fortran benchmarks

- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qipa=noobject
- EXTRA_FFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -O4
- EXTRA_FOPTIMIZE
- Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.
  -O4 is equivalent to the following flags:
  - -O3
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
  - -qsimd=auto
- Includes:
  - -O3
    - -O2
      - -O
    - -qhot=level=0
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
  - -qsimd=auto
- -qarch=pwr9
- EXTRA_FOPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- EXTRA_FOPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qprefetch=dscr=1
- EXTRA_FOPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive : Aggressively prefetch data.
  - -qprefetch=dscr=value : Sets the Data Stream Control Register (DSCR) to the specified value when the program runs.
  Example : -qprefetch=dscr=42

Peak Optimization Flags

C benchmarks

500.perlbench_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O3
- OPTIMIZE
- Performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics of the program slightly unless -qstrict is specified. These optimizations are recommended when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources.
  The optimizations provided include:
  - In-depth memory access analysis
  - Better loop scheduling
  - High-order loop analysis and transformations (-qhot=level=0)
  - Inlining of small procedures within a compilation unit by default
  - Eliminating implicit compile-time memory usage limits
  - Widening, which merges adjacent load/stores and other operations
  - Pointer aliasing improvements to enhance other optimizations
  -O3 is equivalent to the following flags :
  - -O2
  - -qhot=level=0
- Includes:
  - -O2
    - -O
      - -O2
  - -qhot=level=0
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qinline=level=10
- OPTIMIZE
- The inline option specifies the threshold and limit of inlined functions. Example : -qinline=40.
- -qprefetch=dscr=1
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive : Aggressively prefetch data.
  - -qprefetch=dscr=value : Sets the Data Stream Control Register (DSCR) to the specified value when the program runs.
  Example : -qprefetch=dscr=42
- -qipa=noobject
- EXTRA_CFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qalias=noansi
- EXTRA_CFLAGS
- -qalias=ansi | noansi :
  If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object. If noansi is specified, the optimizer makes worst-case aliasing assumptions. The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.
  -qalias=std | nostd :
  Indicates whether the compilation units contain any non-standard aliasing. If so, specify nostd.
- -qstrict=nans
- EXTRA_CFLAGS
- Disables transformations that may produce incorrect results in the presence of IEEE floating-point NaN (not-a-number) values, or that may incorrectly produce such values.
- -qfdpr
- EXTRA_COPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
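
The -qstrict=nans entry above concerns transformations that are only valid when no NaN values occur. The C sketch below shows two expressions whose results depend on NaN handling. Which transformations the compiler actually applies is not stated in this document, so these are generic IEEE 754 illustrations with hypothetical function names.

```c
#include <assert.h>
#include <math.h>

/* Self-comparison test for NaN: IEEE 754 defines NaN != NaN as true.
 * A transformation that assumes NaNs never occur could fold x != x to 0. */
int is_nan_by_compare(double x)
{
    return x != x;
}

/* x * 0.0 is not always 0.0: it is NaN when x is NaN or infinite, so
 * simplifying it to a constant is only safe if NaNs are ruled out. */
double times_zero(double x)
{
    return x * 0.0;
}

void demo_nans(void)
{
    assert(is_nan_by_compare(NAN));
    assert(!is_nan_by_compare(1.5));
    assert(is_nan_by_compare(times_zero(NAN)));
    assert(times_zero(3.0) == 0.0);
}
```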

502.gcc_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O3
- OPTIMIZE
- Performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics of the program slightly unless -qstrict is specified. These optimizations are recommended when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources.
  The optimizations provided include:
  - In-depth memory access analysis
  - Better loop scheduling
  - High-order loop analysis and transformations (-qhot=level=0)
  - Inlining of small procedures within a compilation unit by default
  - Eliminating implicit compile-time memory usage limits
  - Widening, which merges adjacent load/stores and other operations
  - Pointer aliasing improvements to enhance other optimizations
  -O3 is equivalent to the following flags :
  - -O2
  - -qhot=level=0
- Includes:
  - -O2
    - -O
      - -O2
  - -qhot=level=0
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc, and free.
- -Wl,-z,muldefs
- LDOPTIMIZE
- Instructs the linker to allow multiple definitions of a symbol; the first definition is used. Normally, when a symbol is defined multiple times, the linker reports a fatal error.
- -qipa=noobject
- EXTRA_CFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qalias=noansi
- EXTRA_CFLAGS
- -qalias=ansi | noansi :
  If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object. If noansi is specified, the optimizer makes worst-case aliasing assumptions. The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.
  -qalias=std | nostd :
  Indicates whether the compilation units contain any non-standard aliasing. If so, specify nostd.
- -qfdpr
- EXTRA_COPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
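
The -O3 entry above notes that program semantics may change slightly unless -qstrict is specified. One commonly cited source of such differences is regrouping of floating-point arithmetic. That is an assumption made for illustration here, not something the flag text states. The C sketch below shows that two mathematically equivalent groupings can give different results, using hypothetical function names.

```c
#include <assert.h>

/* Evaluates (a + b) + c exactly as written. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Evaluates a + (b + c), a mathematically equivalent regrouping. */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}

void demo_reassociation(void)
{
    /* The 1.0 survives when the two large terms cancel first, but it is
     * rounded away when it is added to -1e20 first. */
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
}
```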

505.mcf_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O5
- OPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qdatasmall
- OPTIMIZE
- This option indicates to the compiler that each dynamic object allocated in the program fits within the size of 4GB.
- -qprefetch=dscr=4
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive : Aggressively prefetch data.
  - -qprefetch=dscr=value : Sets the Data Stream Control Register (DSCR) to the specified value when the program runs.
  Example : -qprefetch=dscr=42
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc, and free.
- -qipa=noobject
- EXTRA_CFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_COPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
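
The -O5 entry above is defined in terms of -O4, and this document defines -O4 and -O3 in the same nested way. The C sketch below flattens those stated equivalences into a single flag list. The table mirrors only the "is equivalent to" statements in this document. The names `expand_flag` and `demo_expand` are hypothetical, and the sketch says nothing about how the compiler itself processes these options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per "X is equivalent to the following flags" statement. */
struct expansion {
    const char *flag;
    const char *parts[6];   /* NULL-terminated list of implied flags */
};

static const struct expansion table[] = {
    { "-O5", { "-O4", "-qipa=level=2", NULL } },
    { "-O4", { "-O3", "-qipa=level=1", "-qarch=auto", "-qtune=auto",
               "-qsimd=auto", NULL } },
    { "-O3", { "-O2", "-qhot=level=0", NULL } },
};

/* Appends the fully expanded form of flag to out, space separated.
 * out must already hold a NUL-terminated string of capacity cap. */
void expand_flag(const char *flag, char *out, size_t cap)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(flag, table[i].flag) == 0) {
            for (size_t j = 0; table[i].parts[j] != NULL; j++)
                expand_flag(table[i].parts[j], out, cap);
            return;
        }
    }
    /* Leaf flag: append it only if separator, flag, and NUL all fit. */
    size_t used = strlen(out);
    if (used + strlen(flag) + 2 <= cap) {
        if (used > 0)
            strcat(out, " ");
        strcat(out, flag);
    }
}

void demo_expand(void)
{
    char buf[128] = "";
    expand_flag("-O5", buf, sizeof buf);
    assert(strcmp(buf, "-O2 -qhot=level=0 -qipa=level=1 -qarch=auto "
                       "-qtune=auto -qsimd=auto -qipa=level=2") == 0);
}
```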

525.x264_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O5
- OPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qnounroll
- OPTIMIZE
- This flag is equivalent to -qunroll=no.
- -qrestrict
- OPTIMIZE
- Adds the restrict type qualifier to the pointer parameters within all functions without modifying the source file.
- -qprefetch=dscr=7
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive : Aggressively prefetch data.
  - -qprefetch=dscr=value : Sets the Data Stream Control Register (DSCR) to the specified value when the program runs.
  Example : -qprefetch=dscr=42
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc, and free.
- -qipa=noobject
- EXTRA_CFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_COPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
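
The -qrestrict entry above describes adding the restrict qualifier to pointer parameters without modifying the source. The C sketch below writes that qualifier out by hand to show what it promises. The function `scale_into` and its parameters are hypothetical and do not come from the benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* With restrict, the caller promises that dst and src do not overlap,
 * so the compiler may load several src elements before storing to dst.
 * Here the qualifier is written explicitly in the source. */
void scale_into(float *restrict dst, const float *restrict src,
                float factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * factor;
}

void demo_restrict(void)
{
    float in[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float out[4] = {0};
    scale_into(out, in, 2.0f, 4);   /* distinct arrays: promise is kept */
    assert(out[0] == 2.0f && out[3] == 8.0f);
}
```

The promise is only safe when every call site really passes non-overlapping storage, which is why applying it to all functions at once is a per-benchmark decision.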

557.xz_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O5
- OPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qpagesize=16M
- OPTIMIZE
- Asserts the minimum physical pagesize during program execution.
- -qprefetch=dscr=7
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive : Aggressively prefetch data.
  - -qprefetch=dscr=value : Sets the Data Stream Control Register (DSCR) to the specified value when the program runs.
  Example : -qprefetch=dscr=42
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc, and free.
- -qipa=noobject
- EXTRA_CFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_COPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.

C++ benchmarks

520.omnetpp_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CXXFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CXXFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O5
- OPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qlibansi
- OPTIMIZE
- Assumes that all functions with the name of an ANSI C defined library function are, in fact, the library functions.
- -qprefetch=dscr=1
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive : Aggressively prefetch data.
  - -qprefetch=dscr=value : Sets the Data Stream Control Register (DSCR) to the specified value when the program runs.
  Example : -qprefetch=dscr=42
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc, and free.
- -qipa=noobject
- EXTRA_CXXFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_CXXOPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.

523.xalancbmk_r

- -Wl,-q
- LDFLAGS
- Pass the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CXXFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CXXFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile-directed feedback compile; it causes the PDF information to be used during optimization.
- -O5
- OPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for suboption are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qpagesize=16M
- OPTIMIZE
- Asserts the minimum physical pagesize during program execution.
- -qprefetch=dscr=7
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive: Aggressively prefetches data.
  - -qprefetch=dscr: Sets the Data Streams Control Register (DSCR) to the specified value when the program executes.
  Example: -qprefetch=dscr=42
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER, which optimizes calls to new, delete, malloc, and free.
- -qipa=noobject
- EXTRA_CXXFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_CXXOPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
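
The -qprefetch entry above says that the compiler inserts prefetch instructions where they may help. As a rough source-level illustration only, the C sketch below issues the same kind of hint by hand. It assumes a compiler that provides the GCC/Clang builtin __builtin_prefetch; the function name sum_with_prefetch and the distance PREFETCH_AHEAD are hypothetical and untuned. The sketch does not set the DSCR and is not a reproduction of the code the XL compiler generates.

```c
#include <stddef.h>

/* Hypothetical prefetch distance in elements; a real value would be tuned. */
#define PREFETCH_AHEAD 16

/* Sums an array while hinting that data a few iterations ahead will be read.
   __builtin_prefetch is only a hint: it never changes the computed result. */
long sum_with_prefetch(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Only form addresses that lie inside the array. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0 /* read */, 1 /* low reuse */);
        total += data[i];
    }
    return total;
}
```

Calling sum_with_prefetch on an array returns the same sum as a plain loop; only the memory access timing can differ.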

531.deepsjeng_r

- -Wl,-q
- LDFLAGS
- Passes the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CXXFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, and it gathers no data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CXXFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.
- -O2
- OPTIMIZE
- -O2 performs a set of optimizations intended to improve performance without an unreasonable increase in the time or storage required for compilation. These include:
  - Elimination of redundant code
  - Basic loop optimization
  - Structuring of code to take advantage of -qarch and -qtune settings, where possible
- Includes:
  - -O
    - -O2
      - -O
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for the suboption are:
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qipa
- OPTIMIZE
- Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA).
- -qrestrict
- OPTIMIZE
- Adds the restrict type qualifier to the pointer parameters within all functions without modifying the source file.
- -qprefetch=dscr=1
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive: Aggressively prefetches data.
  - -qprefetch=dscr: Sets the Data Streams Control Register (DSCR) to the specified value when the program executes.
  Example: -qprefetch=dscr=42
- -qipa=noobject
- EXTRA_CXXFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_CXXOPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
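
The -qrestrict entry above describes adding the restrict qualifier to pointer parameters without editing the source. The C99 sketch below writes that qualifier out by hand to show what is being promised to the compiler. The function names scale_plain and scale_restrict are hypothetical. The promise is not checked: results are only defined when callers never pass overlapping arrays.

```c
#include <stddef.h>

/* Without restrict, the compiler must assume dst and src may overlap,
   so a store to dst[i] could change a later src element. */
void scale_plain(double *dst, const double *src, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* With restrict, the caller promises that dst and src do not overlap.
   This leaves the optimizer free to reorder or vectorize loads and stores. */
void scale_restrict(double *restrict dst, const double *restrict src,
                    double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

For non-overlapping arrays both functions produce identical output; the difference is only in what the compiler is allowed to assume.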

541.leela_r

- -Wl,-q
- LDFLAGS
- Passes the -q flag to the linker, causing the final executable to retain relocation information.
- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_CXXFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, and it gathers no data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_CXXFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.
- -O5
- OPTIMIZE
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
  -O5 is equivalent to the following flags :
  - -O4
  - -qipa=level=2
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
    - -qsimd=auto
  - -qipa=level=2
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for the suboption are:
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qenum=small
- OPTIMIZE
- Tells the compiler to store each enumeration in the smallest integer size that can represent its range of values.
- -funroll-all-loops
- OPTIMIZE
- Instructs the compiler to search for more opportunities for loop unrolling than that performed with -funroll-loops. In general, -funroll-all-loops has more chances to increase compile time or program size than -funroll-loops processing, but it might also improve your application's performance.
- -qinline=level=10
- OPTIMIZE
- The inline option specifies the threshold and limit of inlined functions. Example : -qinline=40.
- -qprefetch=dscr=6
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive: Aggressively prefetches data.
  - -qprefetch=dscr: Sets the Data Streams Control Register (DSCR) to the specified value when the program executes.
  Example: -qprefetch=dscr=42
- -qipa=noobject
- EXTRA_CXXFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
- -qfdpr
- EXTRA_CXXOPTIMIZE
- The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
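
The -funroll-all-loops entry above trades compile time and program size for possible run-time gains. To make that trade concrete, the C sketch below unrolls a summation loop by hand. The function names and the unroll factor of 4 are hypothetical choices for illustration; the factor a compiler picks, and whether it unrolls a given loop at all, is not specified here.

```c
#include <stddef.h>

/* Original loop: one add and one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Unrolled by 4: one loop test per four elements, at the cost of a
   larger loop body and a cleanup loop for the leftover elements. */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        total += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)   /* remainder when n is not a multiple of 4 */
        total += v[i];
    return total;
}
```

Both functions return the same value for any input; the unrolled version simply executes fewer branch instructions and occupies more code space.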

Fortran benchmarks

- -lhugetlbfs
- EXTRA_LDFLAGS
- Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
- -qpdf1
- PASS1_FFLAGS, PASS1_LDFLAGS
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, and it gathers no data other than path and data values for PDF-specific optimizations.
- -qpdf2
- PASS2_FFLAGS, PASS2_LDFLAGS
- The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.
- -O3
- OPTIMIZE
- -O3 performs additional optimizations that are memory intensive and compile-time intensive, and that may change the semantics of the program slightly unless -qstrict is specified. We recommend these optimizations when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources.
  The optimizations provided include:
  - In-depth memory access analysis
  - Better loop scheduling
  - High-order loop analysis and transformations (-qhot=level=0)
  - Inlining of small procedures within a compilation unit by default
  - Eliminating implicit compile-time memory usage limits
  - Widening, which merges adjacent load/stores and other operations
  - Pointer aliasing improvements to enhance other optimizations
  -O3 is equivalent to the following flags :
  - -O2
  - -qhot=level=0
- Includes:
  - -O2
    - -O
      - -O2
  - -qhot=level=0
- -qarch=pwr9
- OPTIMIZE
- Produces object code containing instructions that will run on the specified processors. auto selects the processor the compile is being done on.
  Supported values for this flag are :
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
- -qtune=pwr9:smt4
- OPTIMIZE
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache setting. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
  The supported values for the suboption are:
  - auto - Use the processor on which the program is compiled.
  - pwr9 - The POWER9 processor based systems.
  - pwr8 - The POWER8 processor based systems.
  - st - Optimizations are tuned for single-threaded execution.
  - smt2 - Optimizations are tuned for SMT2 execution mode.
  - smt4 - Optimizations are tuned for SMT4 execution mode.
  - smt8 - Optimizations are tuned for SMT8 execution mode.
- -qhot
- OPTIMIZE
- Performs high-order transformations on loops during optimization. Example usages: -qhot, -qhot=level=1, -qhot=simd, -qhot=novector
  The supported values for the suboption are:
  - arraypad - The compiler will pad any arrays where it infers that there may be a benefit.
  - level=0 - The compiler performs a limited set of high-order loop transformations.
  - level=1 - The compiler performs its full set of high-order loop transformations.
  - simd - Replaces certain instruction sequences with vector instructions.
  - vector - Replaces certain instruction sequences with calls to the MASS library.
  Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector, and -qhot=level=1. The -qhot option is also implied by -O4 and -O5.
- -qsimd=noauto
- OPTIMIZE
- - -qsimd : enables the generation of vector instructions for processors that support them.
  - -qnosimd : disables the generation of vector instructions.
  - Default : whether -qsimd is specified or not, -qsimd=auto is implied at the -O3 or higher optimization level; -qsimd=noauto is implied at the -O2 or lower optimization level.
- -qsmallstack
- OPTIMIZE
- Reduces the size of the stack frame. Programs that allocate large amounts of data on the stack, such as threaded programs, may cause stack overflows. This option can reduce the size of the stack frame to help avoid them.
- -qprefetch=dscr=9
- OPTIMIZE
- Inserts prefetch instructions automatically where there are opportunities to improve code performance.
  - -qprefetch=aggressive: Aggressively prefetches data.
  - -qprefetch=dscr: Sets the Data Streams Control Register (DSCR) to the specified value when the program executes.
  Example: -qprefetch=dscr=42
- -ltcmalloc
- OPTIMIZE
- Link with the tcmalloc library for Linux on POWER, which optimizes calls to new, delete, malloc, and free.
- -qipa=noobject
- EXTRA_FFLAGS
- Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
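
The -qhot entry above refers to high-order loop transformations without showing one. Loop interchange is a common member of that family, and the sketch below applies it by hand. It is written in C, where arrays are row-major (Fortran arrays are column-major, so the favorable loop order is reversed there). The function names are hypothetical, and no claim is made that the compiler applies this particular transformation to any given loop.

```c
#include <stddef.h>

/* Before: the inner loop walks down a column, so consecutive accesses
   are cols elements apart in memory (C stores each row contiguously). */
double sum_strided(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            total += m[i * cols + j];
    return total;
}

/* After interchange: the inner loop reads consecutive elements.
   The additions happen in a different order, so floating-point
   results can differ in the last bits for general data. */
double sum_contiguous(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += m[i * cols + j];
    return total;
}
```

The change in summation order is one example of how aggressive loop optimization can alter program semantics slightly, as the -O3 entry notes.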

Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

Commands and Options Used to Submit Benchmark Runs

Commands and Options Used for Feedback-Directed Optimization

fdprpro is a Feedback Directed Program Restructuring optimization tool that is available for the IBM POWER platform. It can be used optionally during FDO.

Shell, Environment, and Other Software Settings

Operating System Tuning Parameters

Firmware / BIOS / Microcode Settings

Power and Performance Mode, which is set from the Advanced System Management menu, controls the trade-offs between power efficiency, frequency, and consistency. Four modes are available:

Idle Power Saver is an option that can be combined with Maximum Performance Mode, Dynamic Performance Mode, and Disable All Modes to allow the system to drop to a frequency level below nominal frequency under programmable idle circumstances.

Default selection is "Speculative execution controls to mitigate user-to-kernel and user-to-user side-channel attacks".

For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2018 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.0.5.
Report generated on 2018-10-31 18:41:13 by SPEC CPU2017 flags formatter v5178.

CPU2017 Flag Description
IBM Corporation IBM Power E950 (3.4 - 3.8 GHz, 40 core, SLES)
