Note: The GNU Compiler Collection provides a wide array of compiler options, described in detail and readily available at https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html#Option-Index and https://gcc.gnu.org/onlinedocs/gfortran/. This SPEC CPU flags file contains excerpts from and brief summaries of portions of that documentation.
SPEC's modifications are:
Copyright (C) 2006-2017 Standard Performance Evaluation Corporation
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being "Funding Free Software", the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in your SPEC CPU kit at $SPEC/Docs/licenses/FDL.v1.3 and on the web at http://www.spec.org/cpu2017/Docs/licenses/FDL.v1.3. A copy of "Funding Free Software" is on your SPEC CPU kit at $SPEC/Docs/licenses/FundingFreeSW and on the web at http://www.spec.org/cpu2017/Docs/licenses/FundingFreeSW.
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.
Invokes the GNU C compiler.
Invokes the GNU C++ compiler.
Invokes the GNU Fortran compiler.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This macro indicates that Fortran functions called from C should have their names lower-cased.
Use big-endian representation for unformatted files. This is important when reading 521.wrf_r, 621.wrf_s, and 628.pop2_s data files that were originally generated in big-endian format.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
Let the type "char" be unsigned, like "unsigned char".
Note: this particular portability flag is included for 526.blender_r per the recommendation in its documentation - see http://www.spec.org/cpu2017/Docs/benchmarks/526.blender_r.html.
Linux portability
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
Fortran to C symbol naming. C symbol names are lower case with one underscore. _symbol
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
This option is used to indicate that the host system's integers are 32-bits wide, and longs and pointers are 64-bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
Generate code for lp64. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
Sets the language dialect to include syntax from the C99 standard, such as bool and other features used in CPU2017 benchmarks.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
Produce debugging information.
Performance improvement.Use pipes rather than temporary files for communication between the various stages of compilation.
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC’s internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Enable Link Time OptimizationOn x86 systems, allows use of instructions that require the listed architecture. On arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*
It is the code-gen option.
Omit the frame pointer in functions that don’t need one. This avoids the instructions to save, set up and restore the frame pointer; on many targets it also makes an extra register available.
It is the linker option. Don’t produce a dynamically linked position independent executable.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Generate code for lp64. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
Produce debugging information.
Performance improvement.Use pipes rather than temporary files for communication between the various stages of compilation.
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC’s internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Enable Link Time OptimizationOn x86 systems, allows use of instructions that require the listed architecture. On arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*
It is the code-gen option.
Omit the frame pointer in functions that don’t need one. This avoids the instructions to save, set up and restore the frame pointer; on many targets it also makes an extra register available.
It is the linker option. Don’t produce a dynamically linked position independent executable.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Generate code for lp64. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
Produce debugging information.
Performance improvement.Use pipes rather than temporary files for communication between the various stages of compilation.
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC’s internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Enable Link Time OptimizationOn x86 systems, allows use of instructions that require the listed architecture. On arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*
It is the code-gen option.
Omit the frame pointer in functions that don’t need one. This avoids the instructions to save, set up and restore the frame pointer; on many targets it also makes an extra register available.
It is the linker option. Don’t produce a dynamically linked position independent executable.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Generate code for lp64. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
Sets the language dialect to include syntax from the C99 standard, such as bool and other features used in CPU2017 benchmarks.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
Produce debugging information.
Performance improvement.Use pipes rather than temporary files for communication between the various stages of compilation.
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC’s internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Enable Link Time OptimizationOn x86 systems, allows use of instructions that require the listed architecture. On arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*
It is the code-gen option.
Omit the frame pointer in functions that don’t need one. This avoids the instructions to save, set up and restore the frame pointer; on many targets it also makes an extra register available.
It is the linker option. Don’t produce a dynamically linked position independent executable.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Generate code for lp64. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
Sets the language dialect to include syntax from the C99 standard, such as bool and other features used in CPU2017 benchmarks.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
Produce debugging information.
Performance improvement.Use pipes rather than temporary files for communication between the various stages of compilation.
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC’s internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Enable Link Time OptimizationOn x86 systems, allows use of instructions that require the listed architecture. On arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*
It is the code-gen option.
Omit the frame pointer in functions that don’t need one. This avoids the instructions to save, set up and restore the frame pointer; on many targets it also makes an extra register available.
It is the linker option. Don’t produce a dynamically linked position independent executable.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
Generate code for lp64. With ilp32, int, long int and pointer are 32-bit; with lp64, int is 32-bit, but long int and pointer are 64-bit.
Sets the language dialect to include syntax from the C99 standard, such as bool and other features used in CPU2017 benchmarks.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
Many more details are available.
Produce debugging information.
Performance improvement.Use pipes rather than temporary files for communication between the various stages of compilation.
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC’s internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
Enable Link Time OptimizationOn x86 systems, allows use of instructions that require the listed architecture. On arm systems, specifies the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*
It is the code-gen option.
Omit the frame pointer in functions that don’t need one. This avoids the instructions to save, set up and restore the frame pointer; on many targets it also makes an extra register available.
It is the linker option. Don’t produce a dynamically linked position independent executable.
Tells the optimizer to unroll loops whose number of iterations can be determined at compile time or upon entry to the loop.
SPECrate runs might use one of these methods to bind processes to specific processors, depending on the config file.
Linux systems: the numactl command is commonly used. Here is a brief guide to understanding the specific command which will be found in the config file:
Solaris systems: The pbind command is commonly used, via
submit=echo 'pbind -b...' > dobmk; sh dobmk
The specific command may be found in the config file; here is a brief guide to understanding that command:
pbind -b causes this copy's processes to be bound to the CPU specified by the expression that follows it. See the config file used in the run for the exact syntax, which tends to be cumbersome because of the need to carefully quote parts of the expression. When all expressions are evaluated, the jobs are typically distributed evenly across the system, with each chip running the same number of jobs as all other chips, and each core running the same number of jobs as all other cores.
The pbind expression may include various elements from the SPEC toolset and from standard Unix commands, such as:
No special commands are needed for feedback-directed optimization, other than the compiler profile flags.
One or more of the following may have been used in the run. If so, it will be listed in the notes sections. Here is a brief guide to understanding them:
LD_LIBRARY_PATH=<directories> (set via config file preENV)
LD_LIBRARY_PATH controls the search order for libraries. Often, it can be defaulted. Sometimes, it is
explicitly set (as documented in the notes in the submission), in order to ensure that the correct versions of
libraries are picked up.
OMP_STACKSIZE=N (set via config file preENV)
Set the stack size for subordinate threads.
ulimit -s N
ulimit -s unlimited
'ulimit' is a Unix commands, entered prior to the run. It sets the stack size for the main process, either
to N kbytes or to no limit.
Select only test related files when installing the operating system,So that many services are not installed, this will reduce the consumption of resources by the operating system itself. In accordance with the following methods to install the operating system: 1.The software installation mode was selected 'Customize now'. 2.Next,In 'base System' column, We choose the following installation package,'Base','Compatibility Libraries', 'Java Platform','Large Systems Performance','Performance Tools','Perl Support'.In 'Development' column, We choose the following installation package,'Development tools'.That is all the installation package.
"cpupower frequency-set" provides a simplified mechanism to adjust processor frequencies when cpu frequency scaling is enabled in the OS. See the cpupower-frequency-set man page for details.Here is a brief description of options used in the config file. By default, settings are applied to all logical cpus in the system.Frequencies can be passed in Hz, kHz (default), MHz, GHz, or THz by postfixing the value with the desired unit name, without any space. Available frequencies and governors can be determined with "cpupower frequency-info".
Tmpfs is a file system which keeps all files in virtual memory.A tmpfs file system will go to swap if memory pressure demands real memory for applications. This can have a very negative effect on the I/O load and system performance
Each process is assigned a time period, known as its time slice, that is the time allowed to run the process. Increse the process time slice can have a positive effect on the calculated sensitivity task. The related kernel parameters are sched_wakeup_granularity_ns, sched_min_granularity_ns, etc.
Transparent Hugepages increase the memory page size from 4 kilobytes to 2 megabytes. Transparent Hugepages provide significant performance advantages on systems with highly contended resources and large memory workloads. If memory utilization is too high or memory is badly fragmented which prevents hugepages being allocated, the kernel will assign smaller 4k pages instead.
On RedHat EL6 and later, Transparent Hugepages are used by default if /sys/kernel/mm/transparent_hugepage/enabled is set to always. The default value is always.
On SUSE SLES11 and later, Transparent Hugepages are used by default if /sys/kernel/mm/transparent_hugepage/enabled is set to always. The default value is always.
nohz_full: This kernel option sets adaptive tick mode (NOHZ_FULL) to specified porcessors. Since the number of interrupts is reduced to ones per second, latency-sensitive applications can take advantage of it.
This BIOS option allows the enabling/disabling of a processor mechanism to prefetch data into the cache according to a pattern-recognition algorithm In some cases, setting this option to Disabled may improve performance. Users should only disable this option after performing application benchmarking to verify improved performance in their environment.
This BIOS option specifies the memory refresh rate.
Values for this BIOS setting can be:
32ms: specifies the memory refresh rate to 32ms.
64ms: specifies the memory refresh rate to 64ms.
Auto: specifies the memory refresh rate to Auto.Server will changes the memory refresh rate along the temperature of the memory.
Values for this BIOS setting can be:
Efficiency: Maximize the power efficiency of the server.
Performance: Maximize the performance of the server.
The Baseboard Management Controller allows the user to adjust the fan speed manually,If the server is in a stressful environment, the CPU have high temperature, you can adjust the fan speed to 100%.
This BIOS option allows the enabling/disabling of a processor mechanism to contiol the interleaving between CPU DIE. In some cases, setting this option to Disabled may improve performance.
This BIOS option allows the enabling/disabling of a processor mechanism to contiol the interleaving between memory channels. In some cases, setting this option to Enabled may improve performance.
This BIOS option controls the cache mode setting of the controllers.
Values for this BIOS setting can be:
in: partition out: share.Set the controllers L3 cache mode. In chip is set to partitioned mode.Out of chip is set to shared mode.
in: share out: share.In chip is set to shared mode.Out of chip is set to shared mode.
in: private out: share.In chip is set to private mode.Out of chip is set to shared mode.
in: private out: private.In chip is set to private mode.Out of chip is set to private mode.
In some cases, such as the cores of cpu are 196,setting this option to "in: private out: private" may improve performance.
Share mode means the L3 cache is shared by all L2 processes, a process can use the capacity of the entire L3.
Private mode, the entire L3 is divided into N private L3s, each private L3 node only caches the data of the corresponding L2 node, that means a process can only ues part of the L3 capacity.
Partition mode, the entire L3 is divided into N private L3s, each L2 cache accesses its corresponding L3 preferentially, and it can also access the other private L3s.
"N" mentioned above is determined by the CPU cores count, every 4 cores are a cluster sharing one piece private L3 cache.
Flag description origin markings:
For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2020 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.1.0.
Report generated on 2020-07-07 14:29:53 by SPEC CPU2017 flags formatter v5178.