Note: The GNU Compiler Collection provides a wide array of compiler options, described in detail and readily available at https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html#Option-Index and https://gcc.gnu.org/onlinedocs/gfortran/. This SPEC CPU flags file contains excerpts from and brief summaries of portions of that documentation.
SPEC's modifications are:
Copyright (C) 2006-2020 Standard Performance Evaluation Corporation
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being "Funding Free Software", the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in your SPEC CPU kit at $SPEC/Docs/licenses/FDL.v1.3.txt and on the web at https://www.spec.org/cpu2017/Docs/licenses/FDL.v1.3.txt. A copy of "Funding Free Software" is on your SPEC CPU kit at $SPEC/Docs/licenses/FundingFreeSW.txt and on the web at https://www.spec.org/cpu2017/Docs/licenses/FundingFreeSW.txt.
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.
Invokes the GNU C compiler.
Invokes the GNU C++ compiler.
Invokes the GNU Fortran compiler.
This macro indicates that the benchmark is being compiled on an AMD64-compatible system running the Linux operating system.
This macro specifies that the target system uses the LP64 data model; specifically, that integers are 32 bits, while longs and pointers are 64 bits.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This flag can be set for SPEC compilation for LINUX using default compiler.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
Sets the language dialect to include syntax from the C99 standard, such as bool and other features used in CPU 2017 benchmarks.
Allows links to proceed even if there are multiple definitions of some symbols. This switch may resolve duplicate symbol errors, as noted in the 502.gcc_r benchmark description.
On systems that support dynamic linking, this overrides -pie and prevents linking with the shared libraries. On other systems, this option has no effect.
Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of main and those merged by attribute externally_visible become static functions and in effect are optimized more aggressively by interprocedural optimizers.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Produce debugging information.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
On x86 systems, allows use of instructions that require the listed architecture.
On Arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}
Enable Link Time Optimization When invoked with source code, it generates GIMPLE (one of GCC's internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Use the specified algorithm for basic block reordering. The algorithm argument can be 'simple', which does not increase code size (except sometimes due to secondary effects like alignment), or 'stc', the "software trace cache" algorithm, which tries to put all often executed code together, minimizing the number of branches executed by making extra copies of code.
Specify growth that the early inliner can make. In effect it increases the amount of inlining for code having a large abstraction penalty.
When you use -finline-functions (included in -O3), a lot of functions that would otherwise not be considered for inlining by the compiler are investigated. To those functions, a different (more restrictive) limit compared to functions declared inline can be applied.
Specifies maximal overall growth of the compilation unit caused by inlining. For example, parameter value 20 limits unit growth to 1.2 times the original size. Cold functions (either marked cold via an attribute or by profile feedback) are not accounted into the unit size.
The language standards set aliasing requirements: programmers are expected to follow conventions so that the compiler can keep track of memory. If a program violates the requirements (for example, using pointer arithmetic), programs may crash, or (worse) wrong answers may be silently produced.
Unfortunately, the aliasing requirements from the standards are not always well understood.
Sometimes, the aliasing requirements are understood and nevertheless intentionally violated by smart programmers who know what they are doing, such as the programmer responsible for the inner workings of Perl storage allocation and variable handling.
The -fno-strict-aliasing switch instructs the optimizer that it must not assume that the aliasing requirements from the standard are met by the current program. You will probably need it for 500.perlbench_r and 600.perlbench_s. Note that this is an optimization switch, not a portability switch. When running SPECint2017_rate_base or SPECint2017_speed_base, you must use the same optimization switches for all the C modules in base; see https://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags and https://www.spec.org/cpu2017/Docs/runrules.html#MustValidate.
Tells GCC to use the GNU semantics for "inline" functions, that is, the behavior prior to the C99 standard. This switch may resolve duplicate symbol errors, as noted in the 502.gcc_r benchmark description.
Pretend the symbol is undefined, to force linking of library modules to define it. You can use -u multiple times with different symbols to force loading of additional library modules. E.g., "-u malloc -ljemalloc" can gurantee to link the malloc defined in jemalloc library. Without using it, the linking options order change may cause failing to link jemalloc.
Link with libjemalloc, a fast, arena-based memory allocator.
Sets the language dialect to include syntax from the 1998 ISO C++ standard plus the 2003 technical corrigendum.
On systems that support dynamic linking, this overrides -pie and prevents linking with the shared libraries. On other systems, this option has no effect.
Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of main and those merged by attribute externally_visible become static functions and in effect are optimized more aggressively by interprocedural optimizers.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Produce debugging information.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
On x86 systems, allows use of instructions that require the listed architecture.
On Arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}
Enable Link Time Optimization When invoked with source code, it generates GIMPLE (one of GCC's internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Use the specified algorithm for basic block reordering. The algorithm argument can be 'simple', which does not increase code size (except sometimes due to secondary effects like alignment), or 'stc', the "software trace cache" algorithm, which tries to put all often executed code together, minimizing the number of branches executed by making extra copies of code.
Specify growth that the early inliner can make. In effect it increases the amount of inlining for code having a large abstraction penalty.
When you use -finline-functions (included in -O3), a lot of functions that would otherwise not be considered for inlining by the compiler are investigated. To those functions, a different (more restrictive) limit compared to functions declared inline can be applied.
Specifies maximal overall growth of the compilation unit caused by inlining. For example, parameter value 20 limits unit growth to 1.2 times the original size. Cold functions (either marked cold via an attribute or by profile feedback) are not accounted into the unit size.
Assume that a loop with an exit will eventually take the exit and not loop indefinitely. This allows the compiler to remove loops that otherwise have no side-effects, not considering eventual endless looping as such.
Pretend the symbol is undefined, to force linking of library modules to define it. You can use -u multiple times with different symbols to force loading of additional library modules. E.g., "-u malloc -ljemalloc" can gurantee to link the malloc defined in jemalloc library. Without using it, the linking options order change may cause failing to link jemalloc.
Link with libjemalloc, a fast, arena-based memory allocator.
On systems that support dynamic linking, this overrides -pie and prevents linking with the shared libraries. On other systems, this option has no effect.
Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of main and those merged by attribute externally_visible become static functions and in effect are optimized more aggressively by interprocedural optimizers.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Add the specified path to the list of paths that the linker will search for archive libraries and control scripts.
Produce debugging information.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
On x86 systems, allows use of instructions that require the listed architecture.
On Arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}
Enable Link Time Optimization When invoked with source code, it generates GIMPLE (one of GCC's internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Use the specified algorithm for basic block reordering. The algorithm argument can be 'simple', which does not increase code size (except sometimes due to secondary effects like alignment), or 'stc', the "software trace cache" algorithm, which tries to put all often executed code together, minimizing the number of branches executed by making extra copies of code.
IPA-CP calculates its own score of cloning profitability heuristics and performs those cloning opportunities with scores that exceed ipa-cp-eval-threshold.
Specifies maximal overall growth of the compilation unit caused by interprocedural constant propagation. For example, parameter value 10 limits unit growth to 1.1 times the original size.
Maximum depth of recursive cloning for self-recursive function.
-finline-functions-called-once, which is implied by -O1, considers all "static" functions called once for inlining into their caller even if they are not marked "inline". If a call to a given function is integrated, then the function is not output as assembler code in its own right.
-fno-inline-functions-called-once inhibits this optimization.
Enabled: Put all local arrays, even those of unknown size onto stack memory.
The -fno- form disables the behavior.
Place uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object. See also https://gcc.gnu.org/gcc-10/porting_to.html.
Write a linker map to the named file.
Write a linker map to the named file.
Write a linker map to the named file.
SPECrate runs might use one of these methods to bind processes to specific processors, depending on the config file.
Linux systems: the numactl command is commonly used. Here is a brief guide to understanding the specific command which will be found in the config file:
Solaris systems: The pbind command is commonly used, via
submit=echo 'pbind -b...' > dobmk; sh dobmk
The specific command may be found in the config file; here is a brief guide to understanding that command:
pbind -b causes this copy's processes to be bound to the CPU specified by the expression that follows it. See the config file used in the run for the exact syntax, which tends to be cumbersome because of the need to carefully quote parts of the expression. When all expressions are evaluated, the jobs are typically distributed evenly across the system, with each chip running the same number of jobs as all other chips, and each core running the same number of jobs as all other cores.
The pbind expression may include various elements from the SPEC toolset and from standard Unix commands, such as:
No special commands are needed for feedback-directed optimization, other than the compiler profile flags.
One or more of the following may have been used in the run. If so, it will be listed in the notes sections. Here is a brief guide to understanding them:
LD_LIBRARY_PATH=<directories> (set via config file preENV)
LD_LIBRARY_PATH controls the search order for libraries. Often, it can be defaulted. Sometimes, it is
explicitly set (as documented in the notes in the submission), in order to ensure that the correct versions of
libraries are picked up.
OMP_STACKSIZE=N (set via config file preENV)
Set the stack size for subordinate threads.
ulimit -s N
ulimit -s unlimited
'ulimit' is a Unix commands, entered prior to the run. It sets the stack size for the main process, either
to N kbytes or to no limit.
Model | Normal TDP | Minimum TDP | Maximum TDP |
---|---|---|---|
EPYC 9654 | 360 | 320 | 400 |
EPYC 9654P | 360 | 320 | 400 |
EPYC 9634 | 290 | 240 | 300 |
EPYC 9554 | 360 | 320 | 400 |
EPYC 9554P | 360 | 320 | 400 |
EPYC 9534 | 280 | 240 | 300 |
EPYC 9474F | 360 | 320 | 400 |
EPYC 9454 | 290 | 240 | 300 |
EPYC 9454P | 290 | 240 | 300 |
EPYC 9374F | 320 | 320 | 400 |
EPYC 9354 | 280 | 240 | 300 |
EPYC 9354P | 280 | 240 | 300 |
EPYC 9334 | 210 | 200 | 240 |
EPYC 9274F | 320 | 320 | 400 |
EPYC 9254 | 200 | 200 | 240 |
EPYC 9224 | 200 | 200 | 240 |
EPYC 9174F | 320 | 320 | 400 |
EPYC 9124 | 200 | 200 | 240 |
Flag description origin markings:
For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2024 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.1.9.
Report generated on 2024-08-14 14:06:52 by SPEC CPU2017 flags formatter v5178.