Fujitsu Siemens Computers SPEC CPU2006 Flag Description
HEADER for OPTIMIZATION
]]>
HEADER for PORTABILITY
]]>
-Qprof_gen
This option instruments a program for profiling
as first step in Profile Guided Optimization.
Profile Guided Optimization (PGO) consists of 3 phases:
Phase 1: Compile and generate instrumented code in preparation
to gather profiling information (compiler flag -Qprof_gen).
Phase 2: Execute the instrumented code and gather profiling information.
Phase 3: Recompile the code and use the profiling information
for improved optimization (compiler flag -Qprof_use).
The option -Qprof_gen instruments a program
for profiling to get the execution count of each basic block.
It also creates a new static profile information file (.spi).
This flag is used in phase 1 of the Profile Guided Optimizer (PGO)
to instruct the compiler to produce code in your object files
in preparation for instrumented execution.
The instrumented code
- Gathers information regarding execution paths.
- Gathers information regarding data values.
- Does not use hardware performance counters.
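As a rough illustration of what "the execution count of each basic block" means, the following Python sketch keeps the counters by hand. It is only an analogy: the compiler inserts equivalent counters into the object code automatically. The function `classify`, the block names, and the training inputs are hypothetical.

```python
from collections import Counter

# One counter per basic block, as an instrumented binary would maintain.
block_counts = Counter()

def classify(n):
    block_counts["entry"] += 1
    if n % 2 == 0:
        block_counts["even_branch"] += 1
        result = "even"
    else:
        block_counts["odd_branch"] += 1
        result = "odd"
    block_counts["exit"] += 1
    return result

def run_training_workload():
    # Phase 2 analogue: execute the instrumented code on sample input.
    for n in (2, 4, 6, 7):
        classify(n)
    return dict(block_counts)

profile = run_training_workload()
print(profile)
```

In this toy run the even branch is executed three times as often as the odd branch. Phase 3 would use this kind of information to favor the frequently executed path.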
]]>
-Qprof_use
This option enables the use of profiling information during optimization
as final step in Profile Guided Optimization.
Profile Guided Optimization (PGO) consists of 3 phases:
Phase 1: Compile and generate instrumented code in preparation
to gather profiling information (compiler flag -Qprof_gen).
Phase 2: Execute the instrumented code and gather profiling information.
Phase 3: Recompile the code and use the profiling information
for improved optimization (compiler flag -Qprof_use).
The option -Qprof_use instructs the compiler to use the profiling
information from phase 2 of PGO in order to produce a profile-optimized
executable (phase 3 of PGO).
It also enables function splitting (option -Qfnsplit)
and function grouping during optimization.
Note that there is no way to turn off function grouping
if you enable it using this option.
The recompilation with -Qprof_use
- Uses information regarding execution paths.
- Uses information regarding data values.
- Does not use hardware performance counters.
- Uses techniques (like function grouping) which are not available without PGO.
]]>
-fast
Maximizes speed across the entire program.
In Windows, it sets the following options:
-O3 -Qipo -Qprec-div- -QxT
Note that programs compiled with the -QxT option
will detect non-compatible processors and generate
an error message during execution.
The -QxT option that is set by the -fast option
cannot be overridden by other command line options.
If you specify -fast and a different processor-specific option,
such as -QxN, the compiler will issue a warning that explains
the -QxT option cannot be overridden.
]]>
-O3
Optimizes for speed. Enables high-level optimization. This level does
not guarantee higher performance. Using this option may increase the
compilation time. Impact on performance is application dependent; some
applications may not see a performance improvement.
The optimizations include:
- All optimizations done with -O2
- loop unrolling, including instruction scheduling
- code replication to eliminate branches
- padding the size of certain power-of-two arrays to allow more efficient cache use.
- When used with -Qax or -Qx, it causes the compiler to perform
more aggressive data dependency analysis than for -O2.
]]>
-Qprec-div-
-Qprec-div improves precision of floating-point divides.
It has a slight impact on speed.
-Qprec-div- disables this option.
With some optimizations, such as -QxN and -QxB,
the compiler may change floating-point division computations
into multiplication by the reciprocal of the denominator.
For example, A/B is computed as A * (1/B) to improve the speed
of the computation.
However, the value produced by this transformation
is sometimes not as accurate as full IEEE division.
When it is important to have fully precise IEEE division,
use -Qprec-div to disable the floating-point
division-to-multiplication optimization.
The result is more accurate, with some loss of performance.
If you specify -Qprec-div-, the compiler enables optimizations
that give slightly less precise results than full IEEE division.
Default is -Qprec-div
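The precision difference between A/B and A * (1/B) can be observed in any IEEE double-precision arithmetic. The Python sketch below is not compiler output; it searches a small range of integers for cases where b/b and b * (1/b) disagree. The function name and the search range are chosen only for illustration.

```python
def reciprocal_mismatches(limit):
    """Return every b in 1..limit for which b/b != b * (1/b) in IEEE doubles."""
    mismatches = []
    for b in range(1, limit + 1):
        x = float(b)
        exact = x / x                # true division, correctly rounded
        approx = x * (1.0 / x)       # division replaced by reciprocal multiply
        if exact != approx:
            mismatches.append(b)
    return mismatches

found = reciprocal_mismatches(100)
print(found)
```

Every true division b/b yields exactly 1.0. The reciprocal form rounds twice, once for 1/b and once for the multiplication, so for some values of b it misses 1.0 by one unit in the last place.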
]]>
-QxT
-Qxprocessor
This option directs the compiler
to generate specialized and optimized code for the Intel processor
that executes your program.
It lets you target your program to run on a specific Intel processor.
processor: the processor
for which you want to target your program.
Here: T. Code is optimized by
generating SSSE3, SSE3, SSE2, and SSE instructions for Intel processors.
Code can be optimized for the Intel Core 2 Duo processor family.
The resulting code may contain unconditional use of features
that are not supported on other processors.
This option also enables new optimizations in addition to Intel
processor-specific optimizations including advanced data layout and code
restructuring optimizations to improve memory accesses for Intel processors.
Programs compiled with -QxT will display a fatal run-time error
if they are executed on unsupported processors.
]]>
-Qipo
-Qipo[n]
This option enables interprocedural optimizations between files.
This is also called multifile interprocedural optimization (multifile IPO)
or Whole Program Optimization (WPO).
When you specify this option, the compiler performs inline function expansion
for calls to functions defined in separate files.
You cannot specify the names for the object files that are created.
n: an optional integer that specifies
the number of object files the compiler should create.
The integer must be greater than or equal to 0.
If you do not specify n, the default is 0.
If n is 0, the compiler decides whether to create one or more object files
based on an estimate of the size of the application.
It generates one object file for small applications,
and two or more object files for large applications.
If n is greater than 0, the compiler generates n object files,
unless n exceeds the number of source files (m),
in which case the compiler generates only m object files.
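The rule for the number of object files can be summarized in a few lines. This is a sketch of the rule as described above, not of the compiler's implementation. The function name is hypothetical, and `None` stands for "the compiler decides based on its size estimate".

```python
def ipo_object_file_count(n, m):
    """Number of object files produced for -Qipo[n] with m source files.

    Returns None when n is 0, because the compiler then chooses the
    count itself from an estimate of the application size.
    """
    if n < 0:
        raise ValueError("n must be greater than or equal to 0")
    if n == 0:
        return None
    # n object files, capped at the number of source files.
    return min(n, m)

print(ipo_object_file_count(4, 10))
print(ipo_object_file_count(4, 2))
print(ipo_object_file_count(0, 10))
```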
]]>
-O2
Optimizes for speed.
The -O2 option includes the following options:
- -Og
- -Oi-
- -Os
- -Oy
- -Ob2
- -GF
- -Gs
- -Gy
- -Qftz
This option defaults to ON.
This option also enables:
- inlining of intrinsics
- Intra-file interprocedural optimizations including:
- inlining
- constant propagation
- forward substitution
- routine attribute propagation
- variable address-taken analysis
- dead static function elimination
- removal of unreferenced variables.
- The following performance optimizations:
- copy propagation
- dead-code elimination
- global register allocation
- global instruction scheduling and control speculation
- loop unrolling
- optimized code selection
- partial redundancy elimination
- strength reduction/induction variable simplification
- variable renaming
- exception handling optimizations
- tail recursions
- peephole optimizations
- structure assignment lowering and optimization
- dead store elimination
]]>
-Qip
Enables interprocedural optimizations within a single file.
]]>
-Qparallel
This option tells the auto-parallelizer to generate multithreaded code
for loops that can be safely executed in parallel.
To use this option, you must also specify -O2 or -O3.
]]>
Enables cache/bandwidth optimization for stores
under conditionals (within vector loops).
This option tells the compiler to perform a conditional check
in a vectorized loop. This checking avoids unnecessary stores
and may improve performance by conserving bandwidth.
Enables the compiler to generate run-time control code
for effective automatic parallelization.
This option generates code to perform run-time checks
for loops that have symbolic loop bounds.
If the granularity of a loop is greater than the parallelization threshold,
the loop will be executed in parallel. If you do not specify this option,
the compiler may not parallelize loops with symbolic loop bounds
if the compile-time granularity estimation of a loop cannot ensure
it is beneficial to parallelize the loop.
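The idea of a run-time check on a symbolic loop bound can be sketched as follows. This is an analogy in Python, not the code the compiler generates. The threshold value, the worker count, and the function name are assumptions made for the example, and trip count is used as a stand-in for loop granularity.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 1000  # hypothetical granularity threshold

def scale(values, factor, workers=4):
    """Multiply every element by factor, choosing the path at run time."""
    n = len(values)  # loop bound known only at run time
    if n <= PARALLEL_THRESHOLD:
        # Too little work to pay for thread start-up: stay serial.
        return [v * factor for v in values], "serial"
    # Enough work: split the iteration space into one chunk per worker.
    chunk = (n + workers - 1) // workers
    parts = [values[i:i + chunk] for i in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda part: [v * factor for v in part], parts)
        scaled = [x for part in results for x in part]
    return scaled, "parallel"

small_result, small_path = scale([1, 2, 3], 2)
large_result, large_path = scale(list(range(5000)), 2)
print(small_path, large_path)
```

Both paths compute the same values. Only the decision of whether to split the loop is deferred until the trip count is known.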
Enables/disables (DEFAULT) the use of ANSI aliasing rules in
optimizations; the user asserts that the program adheres to
these rules.
-Qfnsplit
Enables function splitting.
This option enables function splitting if -Qprof_use is also specified.
Otherwise, this option has no effect.
It is enabled automatically if you specify -Qprof_use. If you do not specify
one of those options, the default is -Qfnsplit-, which disables
function splitting but leaves function grouping enabled.
To disable function splitting when you use -Qprof_use, specify -Qfnsplit-.
]]>
Select the method that the register allocator uses to partition
each routine into regions
- routine - one region per routine
- block - one region per block
- trace - one region per trace
- loop - one region per loop
- default - compiler selects best option
]]>
Multi-versioning is used for generating different versions of the loop based on
run time dependence testing, alignment and checking for short/long trip counts.
If this option is turned on, it will trigger more versioning at the expense
of creating more overhead to check for pointer aliasing and scalar replacement.
Specifies whether streaming stores are generated:
- always - enables generation of streaming stores under the assumption
that the application is memory bound
- auto - compiler decides when streaming stores are used (DEFAULT)
- never - disables generation of streaming stores
]]>
-Og
Enables global optimizations.
]]>
-Oi
Enables/disables inline expansion of intrinsic functions.
Default enabled
]]>
-Os
This option enables most speed optimizations, but disables some that increase
code size for a small speed benefit.
Default enabled
]]>
-Oy
Enables [disables] the use of the EBP register in optimizations.
When you disable it with -Oy-, the EBP register is used as the frame pointer.
-Oy- has the effect of reducing the number of available general-purpose registers by 1,
and can produce slightly less efficient code.
Default enabled
]]>
-Ob<n>
n = 0
Disables inlining of user-defined functions.
However, statement functions are always inlined.
n = 1
Enables inlining of functions declared with the __inline keyword.
Also enables inlining according to the C++ language.
n = 2
Enables inlining of any function.
However, the compiler decides which functions are inlined.
This option enables interprocedural optimizations and has the same
effect as specifying option -Qip.
Default enabled with n = 2
]]>
-GF
This option enables read-only string-pooling optimization.
]]>
-Gs
Disables stack-checking for routines with n or more bytes of local
variables and compiler temporaries.
Default enabled with n = 4096.
]]>
-Oa
Assume [not assume] no aliasing
Default disabled
]]>
-Ot
Enables all speed optimizations.
Overrides -Os
]]>
-Ow
Assume [not assume] no cross-function aliasing.
]]>
-Gf
Enables string-pooling optimization.
]]>
-Gy
Packages functions to enable linker optimization.
Default enabled
]]>
-QaxP
Generates specialized code for processor specific codes K, W, N, P while also generating generic IA-32 code.
- K = Intel Pentium III and compatible Intel processors
- W = Intel Pentium 4 and compatible Intel processors
- N = Intel Pentium 4 and compatible Intel processors.
These options also enable advanced data layout and code restructuring
optimizations to improve memory accesses for Intel processors.
- P = Intel Pentium 4 processor with Streaming SIMD Extensions 3 (SSE3) support.
These options also enable advanced data layout and code restructuring optimizations
to improve memory accesses for Intel processors.
]]>
-Qrcd
Enables [disables] fast floating-point to integer conversions.
This option does not guarantee that any particular rounding mode will be used.
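Why the rounding mode matters can be seen by comparing two possible conversion behaviors. The sketch below assumes the two candidates are truncation (what a C cast specifies) and round-to-nearest-even (a common default hardware rounding mode). The flag description does not say which modes can occur, so this is an illustration only.

```python
import math

def convert_truncate(x):
    """Conversion that discards the fractional part, as a C cast does."""
    return math.trunc(x)

def convert_nearest(x):
    """Conversion that rounds to nearest, ties to even."""
    return round(x)

samples = (2.5, 2.7, -2.7, 3.5)
table = [(x, convert_truncate(x), convert_nearest(x)) for x in samples]
for row in table:
    print(row)
```

A program whose results depend on one of these behaviors may change its output when fast conversions are enabled.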
]]>
-Qansi_alias
for C and C++
Qansi_alias directs the compiler to assume the following:
- Arrays are not accessed out of bounds.
- Pointers are not cast to non-pointer types, and vice-versa.
- References to objects of two different scalar types cannot alias.
For example, an object of type int cannot alias with an object of type float,
or an object of type float cannot alias with an object of type double.
If your program satisfies the above conditions, setting the -Qansi_alias
flag will help the compiler better optimize the program. However, if your
program does not satisfy one of the above conditions, the -Qansi_alias
flag may lead the compiler to generate incorrect code.
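The kind of access that violates the third assumption is one where the same storage is viewed as two different scalar types. The Python sketch below reproduces that situation with a `ctypes` union. It is an analogy for a C program that reads a float through an int pointer, and the class and field names are hypothetical.

```python
import ctypes

class FloatOrInt(ctypes.Union):
    # Both fields occupy the same four bytes of storage.
    _fields_ = [("as_float", ctypes.c_float),
                ("as_uint", ctypes.c_uint32)]

cell = FloatOrInt()
cell.as_float = 1.0
bits = cell.as_uint          # the float's bit pattern, read as an integer
print(hex(bits))

cell.as_uint = 0x40000000    # write through the integer view
value = cell.as_float        # the change is visible through the float view
print(value)
```

Under -Qansi_alias the compiler may assume that a write through one type never changes what is read through the other, so a C program relying on this pattern can be miscompiled.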
for Fortran
Enables (default) or disables the assumption that the program adheres to the ANSI Fortran type aliasability rules.
For example, an object of type real cannot be accessed as an integer.
You should see the ANSI Standard for the complete set of rules.
]]>
-Qfp_port
Rounds floating-point results at assignments and casts (with some impact on speed).
]]>
-Qftz
This option flushes denormal results to zero when the application
is in the gradual underflow mode. It may improve performance
if the denormal values are not critical to your application's behavior.
This option only has an effect when the main program is being compiled.
It sets the ftz mode for the process.
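The following Python sketch shows what a denormal value is and what flushing it to zero does. Python cannot switch the processor's ftz mode, so the flush is simulated by a helper function whose name is hypothetical.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest normalized double

def flush_to_zero(x):
    """Simulate ftz: replace a denormal result with a signed zero."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x

# Gradual underflow: halving the smallest normal gives a nonzero denormal.
denormal = SMALLEST_NORMAL / 2
flushed = flush_to_zero(denormal)
print(denormal, flushed)
```

With gradual underflow the tiny value survives at reduced precision. With ftz it becomes zero, which is faster on many processors but changes results that depend on such values.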
]]>
-Qprefetch
This option enables prefetch insertion optimization.
The goal of prefetching is to reduce cache misses
by providing hints to the processor about when data
should be loaded into the cache.
Default is -Qprefetch- which disables this kind of optimization.
]]>
-Qunrolln tells the compiler the maximum number
of times to unroll loops.
If n is not specified, the optimizer determines
how many times loops can be unrolled.
If n is 0, loop unrolling is disabled.
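Loop unrolling replaces several iterations with one longer loop body, so the loop overhead is paid less often. The sketch below does by hand, in Python, what the optimizer does on the generated code. The unroll factor of 4 and the function names are chosen only for illustration.

```python
def sum_rolled(values):
    """Reference loop: one element per iteration."""
    total = 0
    for v in values:
        total += v
    return total

def sum_unrolled_by_4(values):
    """Same loop unrolled four times, with a remainder loop."""
    total = 0
    n = len(values)
    main_end = n - (n % 4)
    for i in range(0, main_end, 4):
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
    # Remainder loop handles the 0 to 3 elements left over.
    for i in range(main_end, n):
        total += values[i]
    return total

data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))
```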
]]>
Enables more aggressive unrolling heuristics.
This option places local variables, except those declared as SAVE, on the run-time stack.
It is as if the variables were declared with the AUTOMATIC attribute.
It does not affect variables that have the SAVE attribute or ALLOCATABLE attribute,
or variables that appear in an EQUIVALENCE statement or in a common block.
This option may provide a performance gain for your program, but if your program depends on
variables having the same value as the last time the routine was invoked, your program may not
function properly.
If you want to cause variables to be placed in static memory, specify /Qsave (Windows).
]]>
-Zp
Specifies the strictest alignment constraint for structure and union types as 1, 2, 4, 8, or 16 bytes.
Default is 16.
Problem: 16 is also possible. How to write regexp?
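One possible pattern that accepts the values 1, 2, 4, 8, and 16 is sketched below in Python. It assumes that the flag may also appear without a number, and the two-digit alternative is listed first for readability. The same pattern should carry over to other regular-expression dialects, but that has not been checked here.

```python
import re

# "16" or one of the single digits 1, 2, 4, 8; the number is optional.
ZP_PATTERN = re.compile(r"-Zp(16|[1248])?")

def is_zp_flag(flag):
    """True if the whole string is a -Zp flag with an allowed alignment."""
    return ZP_PATTERN.fullmatch(flag) is not None

for candidate in ("-Zp", "-Zp1", "-Zp8", "-Zp16", "-Zp3", "-Zp32", "-Zp160"):
    print(candidate, is_zp_flag(candidate))
```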
]]>
-arch:SSE
Enables the compiler to use SSE instructions.
]]>
-arch:SSE2
Enables the compiler to use SSE2 instructions.
]]>
-Qpc64
Enables floating-point significand precision control.
The value is used to round the significand to the correct number of bits.
The value must be either 32, 64 or 80.
Default enabled
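Rounding the significand to a given number of bits can be illustrated in Python. The sketch assumes that the settings 32 and 64 correspond to 24-bit and 53-bit significands, as in x87 precision control; that mapping is not stated in this description. The 80 setting cannot be shown because Python floats hold only 53 bits.

```python
import math
import struct

def round_significand(x, bits):
    """Round x to a significand of the given number of bits."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    mantissa, exponent = math.frexp(x)      # x == mantissa * 2**exponent
    scaled = round(mantissa * 2.0 ** bits)  # keep 'bits' significant bits
    return math.ldexp(scaled / 2.0 ** bits, exponent)

as_single = round_significand(0.1, 24)
as_double = round_significand(0.1, 53)
# Cross-check the 24-bit result against a real single-precision round trip.
single_reference = struct.unpack("f", struct.pack("f", 0.1))[0]
print(as_single, as_double)
```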
]]>
-Ox
Same as the -O2 option: enables -Gs, -Ob1, -Og, -Oy, and -Ot.
]]>
-auto
Determines whether local variables are put on the run-time stack.
]]>
-Qscalar_rep-
Enables [disables] scalar replacement performed during loop transformations
(requires -O3).
]]>
-Qcxx-features
This option enables standard C++ features without disabling Microsoft
features within the bounds of what is provided in the Microsoft headers and
libraries.
This option has the same effect as specifying -GX -GR.
-GX Enables C++ exception handling.
-GR Enables C++ Run Time Type Information (RTTI).
]]>
-F10000
Specifies the stack reserve amount for the program.
-F<n>
<n> is the stack reserve amount.
It can be specified as a decimal integer or by using a C-style convention
for constants (for example, -F0x1000).
Default: The stack size default is chosen by the operating system.
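The two accepted notations for the amount can be decoded as in the following Python sketch. The helper name is hypothetical. Python's base-0 parsing is close to the C convention but not identical: it rejects a leading-zero form such as 010, which C would read as octal.

```python
def parse_stack_reserve(flag):
    """Return the stack reserve amount encoded in a -F<n> flag."""
    if not flag.startswith("-F"):
        raise ValueError("not a -F flag: " + flag)
    # Base 0 lets int() accept both decimal and 0x-prefixed constants.
    return int(flag[2:], 0)

print(parse_stack_reserve("-F10000"))
print(parse_stack_reserve("-F0x1000"))
```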
]]>
-link -FORCE:MULTIPLE
Forces linking even if multiple entry names are found.
]]>
shlw32m.lib
Link with MicroQuill SmartHeap Library.
Available from
http://www.microquill.com/
]]>
shlw64m.lib
Link with MicroQuill SmartHeap Library (64-bit version).
Available from
http://www.microquill.com/
]]>
The use of -Qparallel to generate auto-parallelized code
requires support libraries that are dynamically linked by default.
Specifying libguide40.lib on the link line statically links in libguide40.lib
to allow auto-parallelized binaries to work on systems which do not have the
dynamic version of this library installed.
]]>
-TP
-TP tells the compiler to process all source or unrecognized file types
as C++ source files.
Default: The compiler assumes that files with the extension .c or .C
are C source files.
To handle them as C++ source files, the compiler flag -TP is needed.
]]>
-DSPEC_CPU_NO_NEED_VA_COPY
Without -DSPEC_CPU_NO_NEED_VA_COPY, a Runtime Error (unhandled win32 exception)
occurs in 400.perlbench.
]]>
-Qlowercase
-Qlowercase causes the compiler to ignore case differences in identifiers
and to convert external names to lowercase.
It is needed to specify the naming convention for mixing C and Fortran codes.
]]>
-assume:underscore
-assume:[no]underscore
Determines whether the compiler appends an underscore character
to external user-defined names.
-assume:underscore is needed to specify the naming convention
for mixing C and Fortran codes.
]]>
-D_Complex=
-Qoption,cpp,--no_wchar_t_keyword
-Qoption,string,options
This option passes options to a specified tool.
string: the name of the tool.
Here: cpp indicates the C++ preprocessor.
options: one or more comma-separated,
valid options for the designated tool.
Here: --no_wchar_t_keyword is passed to the C++ preprocessor to indicate
that there is no wchar_t keyword.
This flag must be used with Microsoft Visual Studio 2005.
It avoids syntax errors coming from the use of wchar_t in 483.xalancbmk.
]]>
Invoke the
Intel C/C++ compiler for applications running on Intel 64.
Also used to invoke linker for C/C++ programs.
]]>
Set the path of the header files to be used by the
Intel C/C++ compiler for applications running on Intel 64.
]]>
Set the path of the library files to be used with object files from the
Intel C/C++ compiler for applications running on Intel 64.
]]>
Set the path of libraries of Microsoft Visual Studio 2005.
Needed to link object files from the
Intel C/C++ compiler for applications running on Intel 64.
]]>
Set additional path of libraries of Microsoft Visual Studio 2005.
Needed to link object files from the
Intel C/C++ compiler for applications running on Intel 64.
]]>
Set additional path of libraries of Microsoft Visual Studio 2005.
Needed to link object files from the
Intel C/C++ compiler for applications running on Intel 64.
]]>
icl
Invoke Intel C/C++ compiler for applications
running on IA-32.
Also used to invoke linker for C/C++ programs.
]]>
ifort
Invoke Intel Fortran compiler for applications
running on IA-32.
Also used to invoke linker for Fortran programs
and C/Fortran mixtures.
]]>
-Qc99
This option enables/disables C99 support for C programs.
]]>
-Qvc7.1
-Qvc8