GNU Compiler Collection Flags

SPEC's modifications are:
Copyright (C) 2006-2017 Standard Performance Evaluation Corporation

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being "Funding Free Software", the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in your SPEC CPU kit at $SPEC/Docs/licenses/FDL.v1.3 and on the web at http://www.spec.org/cpu2017/Docs/licenses/FDL.v1.3. A copy of "Funding Free Software" is on your SPEC CPU kit at $SPEC/Docs/licenses/FundingFreeSW and on the web at http://www.spec.org/cpu2017/Docs/licenses/FundingFreeSW.

You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.

Optimization Flags

-ffast-math
-fgnu89-inline
-fstack-arrays, -fno-stack-arrays
-fno-strict-aliasing
-fno-tree-loop-vectorize
-fno-unsafe-math-optimizations
-fopenmp
-fplugin=/path/to/plugin
-fprefetch-loop-arrays
-fprofile-generate
-fprofile-use
-fsigned-zeros
-ftree-parallelize-loops
-funroll-all-loops
-funroll-loops
-g
-L/path
-m32
-m64
-mabi=ilp32, -mabi=lp64
-march=core2, -march=athlon, -march=native...
-mavx
-mcpu=core2, -mcpu=niagara4, ...
-mfmaf
-mrecip, -mrecip=all, -mrecip=sqrt, ...
-msse2, -msse4.2...
-mtune=niagara4, -mtune=athlon...
-mvis3
-Ofast
-O1, -O2, -O3
-Osilent-gcc
-ffree-line-length-none
-fno-stack-protector
-static-libstdcxx
-std=c99-gcc
-std=c++14
-std=c++17
-std=c++03
-std=f2003
-Wl-dead_strip
-Wl,-rpath,/path/to/lib
-Wl,-stack_size,0xnnn
-z muldefs
-Wl,-z common-page-size=<n>

- -ffast-math
- (?:^|(?<=\s))-ffast-math(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Enables a range of optimizations that provide faster, though sometimes less precise, mathematical operations.
- -fgnu89-inline
- (?:^|(?<=\s))-fgnu89-inline(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Tells GCC to use the GNU semantics for "inline" functions, that is, the behavior prior to the C99 standard. This switch may resolve duplicate symbol errors, as noted in the 502.gcc_r benchmark description.
- -fstack-arrays, -fno-stack-arrays
- -f(?:no-)?stack-arrays
- Enabled: Put all local arrays, even those of unknown size, onto stack memory.
  The -fno- form disables the behavior.
- -fno-strict-aliasing
- (?:^|(?<=\s))-fno-strict-aliasing(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- The language standards set aliasing requirements: programmers are expected to follow conventions so that the compiler can keep track of memory. If a program violates the requirements (for example, using pointer arithmetic), programs may crash, or (worse) wrong answers may be silently produced.
  
  Unfortunately, the aliasing requirements from the standards are not always well understood.
  
  Sometimes, the aliasing requirements are understood and nevertheless intentionally violated by smart programmers who know what they are doing, such as the programmer responsible for the inner workings of Perl storage allocation and variable handling.
  
  The -fno-strict-aliasing switch instructs the optimizer that it must not assume that the aliasing requirements from the standard are met by the current program. You will probably need it for 500.perlbench_r and 600.perlbench_s. Note that this is an optimization switch, not a portability switch. When running SPECint2017_rate_base or SPECint2017_speed_base, you must use the same optimization switches for all the C modules in base; see http://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags and http://www.spec.org/cpu2017/Docs/runrules.html#MustValidate.
- -fno-tree-loop-vectorize
- (?:^|(?<=\s))-fno-tree-loop-vectorize(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- There is a group of GCC optimizations invoked via -ftree-vectorize and related flags, as described at https://gcc.gnu.org/projects/tree-ssa/vectorization.html. During testing of SPEC CPU2017, for some versions of GCC on some chips, some benchmarks did not get correct answers when the vectorizer was enabled. These problems were difficult to isolate, and it is possible that later versions of the compiler might not encounter them.
  
  You can turn off loop vectorization with -fno-tree-loop-vectorize. Note that this is an optimization switch, not a portability switch. If it is needed, then in base you must use it consistently. See: http://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags and http://www.spec.org/cpu2017/Docs/runrules.html#MustValidate.
- -fno-unsafe-math-optimizations
- (?:^|(?<=\s))-fno-unsafe-math-optimizations(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- The switch -funsafe-math-optimizations allows the compiler to make certain(*) aggressive assumptions, such as disregarding the programmer's intended order of operations. The run rules allow such re-ordering; see http://www.spec.org/cpu2017/Docs/runrules.html#reordering. The rules also point out that you must get answers that pass SPEC's validation requirements. In some cases, that will mean that some optimizations must be turned off.
  
  -fno-unsafe-math-optimizations turns off these(*) optimizations. You may need to use this flag in order to get certain benchmarks to validate. Note that this is an optimization switch, not a portability switch. If it is needed, then in base you will need to use it consistently. See: http://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags and http://www.spec.org/cpu2017/Docs/runrules.html#MustValidate.
  
  (*) Much more detail about which optimizations is available.
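
The effect of disregarding the intended order of operations can be seen with three numbers. The following is a minimal sketch in Python (not GCC output); the values are chosen only for illustration. It shows that floating-point addition is not associative, which is why regrouping can change an answer enough to fail validation.

```python
# Floating-point addition is not associative, so regrouping can change a result.
a, b, c = 1e20, -1e20, 1.0

as_written = (a + b) + c   # the large terms cancel first, then 1.0 is added
regrouped = a + (b + c)    # 1.0 is absorbed into -1e20 before the cancellation

print(as_written, regrouped)
```

Both expressions are mathematically equal, but only the first preserves the contribution of the small term.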
- -fopenmp
- (?:^|(?<=\s))-fopenmp(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Yes
- Enable handling of OpenMP directives and generate parallel code.
- -fplugin=/path/to/plugin
- -fplugin=\S+(?=\s|$)
- Loads the named plugin, a shared object file, into the compiler.
- -fprefetch-loop-arrays
- (?:^|(?<=\s))-fprefetch-loop-arrays(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Enables prefetching of arrays used in loops.
- -fprofile-generate
- (?:^|(?<=\s))-fprofile-generate(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Instruments code to collect information for profile-driven feedback. Information is collected regarding both code paths and data values.
- -fprofile-use
- (?:^|(?<=\s))-fprofile-use(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Applies information from a profile run in order to improve optimization. Several optimizations are improved when profile data is available, including branch probabilities, loop peeling, and loop unrolling.
- -fsigned-zeros
- (?:^|(?<=\s))-fsigned-zeros(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Disable optimizations for floating-point arithmetic that ignore the signedness of zero.
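
As a rough illustration of why the sign of zero matters, the sketch below (Python, with a hypothetical helper named sign_of) compares the expression x + 0.0 with the simplified form x. The two differ when x is negative zero, so a compiler that honors signed zeros cannot make that simplification.

```python
import math

def sign_of(x):
    """Return -1.0 or 1.0 according to the sign bit of x, including for zeros."""
    return math.copysign(1.0, x)

x = -0.0
simplified = x      # what rewriting "x + 0.0" as "x" would produce
exact = x + 0.0     # IEEE 754 round-to-nearest gives +0.0 here

print(x == 0.0)                              # the two zeros compare equal
print(sign_of(simplified), sign_of(exact))   # but their sign bits differ
```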
- -ftree-parallelize-loops
- -ftree-parallelize-loops=\d
- Yes
- Attempts to decompose loops in order to run them on multiple processors.
- -funroll-all-loops
- (?:^|(?<=\s))-funroll-all-loops(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Tells the optimizer to unroll all loops.
- -funroll-loops
- (?:^|(?<=\s))-funroll-loops(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
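
The transformation itself can be sketched by hand. The Python functions below are an illustration only, not what GCC emits, and the unroll factor of 4 is an arbitrary assumption. The iteration count is known on entry to the loop, which is the case this flag targets.

```python
def sum_rolled(values):
    """Original loop: one element and one loop test per iteration."""
    total = 0.0
    for v in values:
        total += v
    return total

def sum_unrolled_by_4(values):
    """Same computation with the loop body replicated four times."""
    n = len(values)
    total = 0.0
    i = 0
    # Main body: four elements per iteration, so the loop test runs n // 4 times.
    while i + 4 <= n:
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    # Remainder loop: handles the last n % 4 elements.
    while i < n:
        total += values[i]
        i += 1
    return total

data = [float(k) for k in range(10)]
print(sum_rolled(data), sum_unrolled_by_4(data))
```

The additions happen in the same order in both versions, so the results are identical; the unrolled form trades larger code for fewer loop tests and branches.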
- -g
- gcc,gfortran,gxx,mpicc,mpif90,mpicxx
- -g(?:\d)?
- Produce debugging information.
- -L/path
- -L(\S+)(?=\s|$)
- Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
- -m32
- gcc,gfortran,gxx,mpicc,mpif90,mpicxx
- -m32
- Compiles for a 32-bit (ILP32) data model.
- -m64
- gcc,gfortran,gxx
- -m64
- Compiles for a 64-bit (LP64) data model.
- -mabi=ilp32, -mabi=lp64
- -mabi=(\S+)(?=\s|$)
- Generate code for the ilp32 or lp64 data model. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
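
The two data models can be told apart from the widths of int, long and pointer. The sketch below is a small Python illustration; the function name data_model is hypothetical, and the final call reports the model of the platform the Python interpreter was built for, which may differ from the one a benchmark is compiled for.

```python
import ctypes

def data_model(int_bits, long_bits, pointer_bits):
    """Classify a platform by the widths of int, long and pointer."""
    sizes = (int_bits, long_bits, pointer_bits)
    if sizes == (32, 32, 32):
        return "ILP32"
    if sizes == (32, 64, 64):
        return "LP64"
    return "other"

# Sizes for the platform this interpreter was built for.
native = data_model(
    8 * ctypes.sizeof(ctypes.c_int),
    8 * ctypes.sizeof(ctypes.c_long),
    8 * ctypes.sizeof(ctypes.c_void_p),
)
print(native)
```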
- -march=core2, -march=athlon, -march=native...
- -march=(\S+)(?=\s|$)
- On x86 systems, allows use of instructions that require the listed architecture.
- -mavx
- (?:^|(?<=\s))-mavx(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Generate code for processors that include the AVX extensions.
- -mcpu=core2, -mcpu=niagara4, ...
- -mcpu=(\S+)
- On SPARC systems, mcpu sets the available instruction set.
  On x86 systems, mcpu is a deprecated synonym for mtune.
- -mfmaf
- (?:^|(?<=\s))-mfmaf(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Generate code to take advantage of fused multiply-add.

- -mrecip, -mrecip=all, -mrecip=sqrt, ...
- (?:^|(?<=\s))-mrecip(?:$[^$]+\))?(?:=\S*)?(?=\s|$)

       -mrecip
           This option enables use of "RCPSS" and "RSQRTSS" instructions (and
           their vectorized variants "RCPPS" and "RSQRTPS") with an additional
           Newton-Raphson step to increase precision instead of "DIVSS" and
           "SQRTSS" (and their vectorized variants) for single-precision
           floating-point arguments.  These instructions are generated only when
           -funsafe-math-optimizations is enabled together with
           -finite-math-only and -fno-trapping-math.

       -mrecip=opt
           This option controls which reciprocal estimate instructions may be
           used.  opt is a comma-separated list of options, which may be
           preceded by a ! to invert the option:

           all
               Enable all estimate instructions.

           default
               Enable the default instructions, equivalent to -mrecip.

           none
               Disable all estimate instructions, equivalent to -mno-recip.

           div Enable the approximation for scalar division.

           vec-div
               Enable the approximation for vectorized division.

           sqrt
               Enable the approximation for scalar square root.

           vec-sqrt
               Enable the approximation for vectorized square root.

           So, for example, -mrecip=all,!sqrt enables all of the reciprocal
           approximations, except for square root.
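
The Newton-Raphson step mentioned for -mrecip can be sketched numerically. The Python below is an illustration under stated assumptions: the "rough" value stands in for a hardware estimate and is simply the exact reciprocal perturbed by 0.1%, and refine_reciprocal is a hypothetical name. One step of x * (2 - a * x) roughly squares the relative error.

```python
def refine_reciprocal(a, estimate):
    """One Newton-Raphson step for f(x) = 1/x - a."""
    return estimate * (2.0 - a * estimate)

a = 3.0
exact = 1.0 / a
rough = exact * (1.0 + 1e-3)   # hypothetical low-precision estimate, 0.1% off
refined = refine_reciprocal(a, rough)

rough_error = abs(rough - exact) / exact
refined_error = abs(refined - exact) / exact
print(rough_error, refined_error)
```

The refined value is still an approximation rather than a correctly rounded quotient, which is consistent with the option only taking effect together with the relaxed math flags listed above.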

- -msse2, -msse4.2...
- -msse[\d.]+
- Allows use of instructions that require the SIMD units of the indicated type.
- -mtune=niagara4, -mtune=athlon...
- -mtune=(\S+)
- Tunes code based on the timing characteristics of the listed processor.
- -mvis3
- (?:^|(?<=\s))-mvis3(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Generate code to take advantage of version 3 of the SPARC Visual Instruction Set extensions.
- -Ofast
- (?:^|(?<=\s))-Ofast(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Enable all optimizations of -O3 plus optimizations that are not valid for standard-compliant programs, such as re-ordering operations without regard to parentheses.
  Many more details are available.
- -O1, -O2, -O3
- gcc,gfortran,gxx,mpicc,mpif90,mpicxx
- -O\d\b
- Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed, such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and vectorization.
  Many more details are available.
- -Osilent-gcc
- gcc,gfortran,gxx,mpicc,mpif90,mpicxx
- -O\b
- Same as -O1.
- -ffree-line-length-none
- -ffree-line-length-none
- Set column after which characters are ignored in typical free-form lines in the source file. The default value is 132. n may be "none", meaning that the entire line is meaningful. -ffree-line-length-0 means the same thing as -ffree-line-length-none.
- -fno-stack-protector
- -fno-stack-protector
- Disables "-fstack-protector" which emits extra code to check for buffer overflows, such as stack smashing attacks.
- -static-libstdcxx
- -static-libstdc\+\+
- Link the C++ library statically.
- -std=c99-gcc
- gcc,gfortran,gxx,mpicc,mpif90,mpicxx
- -std=c99
- Sets the language dialect to include syntax from the C99 standard, such as bool and other features used in CPU2017 benchmarks.
- -std=c++14
- gxx,mpicxx
- -std=c\+\+14
- Sets the language dialect to include syntax from the 2014 ISO C++ standard.
- -std=c++17
- gxx,mpicxx
- -std=c\+\+17
- Sets the language dialect to include syntax from the 2017 ISO C++ standard.
- -std=c++03
- gxx,mpicxx
- -std=c\+\+03
- Sets the language dialect to include syntax from the 1998 ISO C++ standard plus the 2003 technical corrigendum.
- -std=f2003
- gfortran,mpif90
- -std=f2003
- Sets the language dialect to include syntax from the Fortran 2003 standard.
- -Wl-dead_strip
- -Wl,-dead_strip
- Remove unused functions from the generated executable. Without this flag, on Mac OS X, you are likely to encounter duplicate symbols when linking 502.gcc_r or 602.gcc_s.
  
  Note that this is an optimization switch, not a portability switch. If it is needed, then in base you must use it consistently. See: http://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags and http://www.spec.org/cpu2017/Docs/runrules.html#MustValidate.
- -Wl,-rpath,/path/to/lib
- -Wl,-rpath,(\S+)(?=\s|$)
- Add the specified directory to the runtime library search path used when linking an ELF executable with shared objects.
- -Wl,-stack_size,0xnnn
- -Wl,-stack_size,(?:0x)?[a-fA-F0-9]+\b
- Add the linker flag that requests a large stack. This flag is likely to be important only to one or two of the floating point speed benchmarks. In accordance with the rules for Base, it is set for all of fpspeed in base. See: http://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags.
- -z muldefs
- (-z muldefs|-Wl,-z,muldefs)
- Allows links to proceed even if there are multiple definitions of some symbols. This switch may resolve duplicate symbol errors, as noted in the 502.gcc_r benchmark description.
- -Wl,-z common-page-size=<n>
- -Wl,-z common-page-size=(\S+)(?=\s|$)
- Set the requested page size for the program to one of the available sizes for your system - for example 2M, 4M, 1G.
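
Each entry above pairs a flag with the regular expression used to recognize it. The sketch below tries three of those patterns against a made-up compile line. It uses Python's re module, which accepts these particular constructs; the SPEC tools themselves are Perl-based, so treat this as an approximation of how matching behaves. The compile line and the file name foo.c are hypothetical.

```python
import re

# Patterns copied from the flag entries above.
PATTERNS = {
    "-march": r"-march=(\S+)(?=\s|$)",
    "-O level": r"-O\d\b",
    "-Wl,-rpath": r"-Wl,-rpath,(\S+)(?=\s|$)",
}

# Hypothetical compile line, made up for this example.
line = "gcc -O2 -march=native -Wl,-rpath,/path/to/lib foo.c"

# For each pattern, record the exact text it matches in the line.
matches = {name: re.search(p, line).group(0) for name, p in PATTERNS.items()}
for name, text in matches.items():
    print(name, "->", text)
```

The lookahead (?=\s|$) requires the match to end at whitespace or at the end of the line, so a pattern does not stop partway through a longer argument.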

Portability Flags

- -D_FILE_OFFSET_BITS=64
- (?:^|(?<=\s))-D_FILE_OFFSET_BITS=64(?=\s|$)
- Ensure that there are no surprises if the benchmarks are run in an environment where file system metadata uses 64 bits.
- -fconvert=big-endian
- (?:^|(?<=\s))-fconvert=big-endian(?=\s|$)
- Use big-endian representation for unformatted files. This is important when reading 521.wrf_r, 621.wrf_s, and 628.pop2_s data files that were originally generated in big-endian format.
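
To show what the byte order changes, the following Python sketch packs one 32-bit float both ways and then reads big-endian bytes with the wrong order. It covers byte order only and does not model the rest of the Fortran unformatted file layout; the value 1.5 is arbitrary.

```python
import struct

value = 1.5
big = struct.pack(">f", value)      # big-endian 32-bit float
little = struct.pack("<f", value)   # little-endian 32-bit float
print(big.hex(), little.hex())

# Reading big-endian bytes with the wrong byte order gives a different number.
right = struct.unpack(">f", big)[0]
wrong = struct.unpack("<f", big)[0]
print(right, wrong)
```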
- -fno-underscoring
- (?:^|(?<=\s))-fno-underscoring(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Do not transform names of entities specified in the Fortran source file by appending underscores to them.
- -funsigned-char
- (?:^|(?<=\s))-funsigned-char(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Let the type "char" be unsigned, like "unsigned char".
  
  Note: this particular portability flag is included for 526.blender_r per the recommendation in its documentation - see http://www.spec.org/cpu2017/Docs/benchmarks/526.blender_r.html.
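
The difference between a signed and an unsigned "char" shows up for byte values above 127. The sketch below uses Python's struct module to interpret the same byte both ways; it is an analogy for the C behavior, not a test of the compiler flag itself.

```python
import struct

raw = bytes([0xFF])
as_signed = struct.unpack("b", raw)[0]     # interpreted like "signed char"
as_unsigned = struct.unpack("B", raw)[0]   # interpreted like "unsigned char"

print(as_signed, as_unsigned)
print(as_signed > 127, as_unsigned > 127)  # comparisons can go either way
```

Code that compares characters against values above 127, or uses them as array indexes, can therefore behave differently depending on the default signedness of "char".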

Compiler Flags

GCC_compiler_path_eater
gcc
gfortran
g++
mpicc
mpif90
mpicxx

- GCC_compiler_path_eater
- \S+/(gcc|g\+\+|gfortran)(?=\s|$)
- gcc
- \bgcc(?=\s|$)
- Invokes the GNU C compiler.
- gfortran
- \bgfortran(?=\s|$)
- Invokes the GNU Fortran compiler.
- g++
- \bg\+\+(?=\s|$)
- Invokes the GNU C++ compiler.
- mpicc
- \bmpicc(?=\s|$)
- Invokes the MPI C driver using the GNU compiler.
- mpif90
- \bmpif90(?=\s|$)
- Invokes the MPI Fortran driver using the GNU compiler.
- mpicxx
- \bmpicxx(?=\s|$)
- Invokes the MPI C++ driver using the GNU compiler.

Other Flags

-ffixed-form
-Wall
-Wno-return-type

- -ffixed-form
- (?:^|(?<=\s))-ffixed-form(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Allows source code in traditional (fixed-column) Fortran layout.
- -Wall
- (?:^|(?<=\s))-Wall(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Enables a broad set of commonly useful compiler warnings.
- -Wno-return-type
- (?:^|(?<=\s))-Wno-return-type(?:$[^$]+\))?(?:=\S*)?(?=\s|$)
- Do not warn about functions defined with a return type that defaults to "int" or which return something other than what they were declared to.

Commands and Options Used to Submit Benchmark Runs

SPECrate runs might use one of these methods to bind processes to specific processors, depending on the config file.

Commands and Options Used for Feedback-Directed Optimization

No special commands are needed for feedback-directed optimization, other than the compiler profile flags.

Shell, Environment, and Other Software Settings

One or more of the following may have been used in the run. If so, it will be listed in the notes sections. Here is a brief guide to understanding them: