IBM XL Compiler Flags and Common Unix Commands and Environment Settings
Compilers: IBM XL C/C++ Enterprise Edition Version 8.0 for AIX
Compilers: IBM XL Fortran Enterprise Edition Version 10.1 for AIX
Compilers: IBM XL C/C++ Enterprise Edition Version 9.0 for AIX
Compilers: IBM XL Fortran Enterprise Edition Version 11.1 for AIX
Compilers: IBM XL C/C++ Version 10.1 for AIX
Compilers: IBM XL Fortran Version 12.1 for AIX
Last updated: 27-Oct-2008
-O5
- -O5\b
-
Performs optimizations for maximum performance. This includes maximum
interprocedural analysis on all of the objects presented on the "link"
step. This level of optimization increases the compiler's memory
usage and compile-time requirements. -O5 provides all of the functionality
of the -O4 option, plus the functionality of the
-qipa=level=2 option.
-O5 is equivalent to the following flags
- Includes:
-
-
- -O3
- -O3\b
-
-O3 performs additional optimizations that are memory intensive and compile-time
intensive, and that may change the semantics of the program slightly unless
-qstrict is specified. We recommend these optimizations when run-time speed
improvements matter more than limiting compile-time
resources. The optimizations provided include:
- In-depth memory access analysis
- Better loop scheduling
- High-order loop analysis and transformations (-qhot=level=0)
- Inlining of small procedures within a compilation unit by default
- Eliminating implicit compile-time memory usage limits
- Widening, which merges adjacent load/stores and other operations
- Pointer aliasing improvements to enhance other optimizations
-O3 is equivalent to the following flags
- Includes:
-
-
- -O
- -O\b
-
-O enables the level of optimization that represents the best tradeoff
between compilation speed and run-time performance.
If you need a specific level of optimization, specify the appropriate
numeric value.
Currently, -O is equivalent to -O2.
- Includes:
-
- -qarch
- -qarch=(\S+)\b
-
Produces object code containing instructions that will run on the
specified processors. "auto" selects the processor on which the compile
is being done. "pwr5x" is the POWER5+ processor.
Supported values for this flag are:
- auto
Use the processor on which the program is compiled.
- pwr6e
The POWER6 processor based systems running in "Enhanced" mode.
- pwr6
The POWER6 processor based systems.
- pwr5x
The POWER5+ processor based systems.
- pwr5
The POWER5 processor based systems.
- pwr4
The POWER4 processor based systems.
- ppc970
The PPC970 processor based systems.
-
-qhot,
-qhot=level=1,
-qhot=simd
- -qhot(=arraypad|=simd|=vector|=level=[01])?\b
-
Performs high-order transformations on loops during optimization.
The supported values for suboption are:
- arraypad
The compiler will pad any arrays where it infers that there may be a benefit.
- level=0
The compiler performs a limited set of high-order loop transformations.
- level=1
The compiler performs its full set of high-order loop transformations.
- simd
Replaces certain instruction sequences with vector instructions.
- vector
Replaces certain instruction sequences with calls to the MASS library.
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector, and
-qhot=level=1. The -qhot option is also implied by -O4 and -O5.
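The pattern listed under the flag name is a regular expression that recognizes the accepted spellings of -qhot. As a rough illustration, the Python sketch below checks a few candidate spellings against it. The helper name is_hot_flag and the choice of whole-string matching are assumptions made for this example and are not part of the flags file.

```python
import re

# Pattern copied from the -qhot entry; the raw string keeps \b as a word boundary.
QHOT_PATTERN = re.compile(r"-qhot(=arraypad|=simd|=vector|=level=[01])?\b")


def is_hot_flag(flag: str) -> bool:
    """Return True if the whole string is an accepted -qhot spelling.

    Hypothetical helper: whole-string matching is an assumption of this sketch.
    """
    return QHOT_PATTERN.fullmatch(flag) is not None


if __name__ == "__main__":
    for candidate in ["-qhot", "-qhot=simd", "-qhot=level=1",
                      "-qhot=level=2", "-qhotx"]:
        print(candidate, is_hot_flag(candidate))
```

Under these assumptions, only level=0 and level=1 are accepted as level values, which agrees with the suboption list above.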
-
-
-qipa=level
- -qipa=level=[012]\b
-
Enhances optimization by doing detailed analysis across procedures
(interprocedural analysis or IPA).
The level determines the amount of interprocedural analysis
and optimization that is performed.
level=0 does only minimal interprocedural analysis and optimization.
level=1 turns on inlining, limited alias analysis, and limited
call-site tailoring.
level=2 turns on full interprocedural data flow and alias analysis.
-
-
- -qpdf1
- -qpdf1\b
-
This option is used in the first pass of a profile-directed feedback (PDF) compile
and causes PDF information to be generated.
Profile-directed feedback optimization gathers data on both execution
path and data values. It does not use hardware counters, nor does it gather any
data other than path and data values for PDF-specific optimizations.
-
-qxlf90=nosignedzero
- -qxlf90=(signedzero|nosignedzero|autodealloc|noautodealloc|oldpad|nooldpad|)\b
-
-qxlf90=<suboption>
Determines whether the compiler provides the
Fortran 90 or the Fortran 95 level of support for
certain aspects of the language. <suboption> can be
one of the following:
signedzero | nosignedzero
Determines how the SIGN(A,B) function handles
signed real 0.0. In addition, determines
whether negative internal values will be
prefixed with a minus when formatted output
would produce a negative sign zero.
autodealloc | noautodealloc
Determines whether the compiler deallocates
allocatable arrays that are declared locally
without either the SAVE or the STATIC
attribute and have a status of currently
allocated when the subprogram terminates.
oldpad | nooldpad
When the PAD= specifier is present in the
INQUIRE statement, specifying -qxlf90=nooldpad
returns UNDEFINED when there is no connection,
or when the connection is for unformatted I/O.
This behavior conforms to the Fortran 95
standard and above. Specifying -qxlf90=oldpad
preserves the Fortran 90 behavior.
Default:
o signedzero, autodealloc and nooldpad for the
xlf95, xlf95_r, xlf95_r7 and f95 invocation
commands.
o nosignedzero, noautodealloc and oldpad for
all other invocation commands.
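The effect of signedzero on SIGN(A,B) can be modeled outside Fortran. The Python sketch below is only an approximation of the two behaviors: it assumes that signedzero makes SIGN honor the sign bit of a zero B, and that nosignedzero treats any zero B as positive. The function names sign_signedzero and sign_nosignedzero are hypothetical and do not come from the compiler.

```python
import math


def sign_signedzero(a: float, b: float) -> float:
    """Model of SIGN(A,B) that honors the sign of a zero B (assumed signedzero behavior)."""
    return math.copysign(abs(a), b)


def sign_nosignedzero(a: float, b: float) -> float:
    """Model of SIGN(A,B) that treats any zero B as positive (assumed nosignedzero behavior)."""
    # -0.0 >= 0 is True in IEEE arithmetic, so a negative zero counts as positive here.
    return abs(a) if b >= 0 else -abs(a)


if __name__ == "__main__":
    for b in (2.0, -2.0, 0.0, -0.0):
        print(b, sign_signedzero(3.0, b), sign_nosignedzero(3.0, b))
```

In this model the two variants differ only when B is a negative zero.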
-
- -qessl
- -qessl\b
-
Specifies that, if either -lessl or -lesslsmp is also
specified, Engineering and Scientific Subroutine Library
(ESSL) routines should be used in place of some Fortran 90
intrinsic procedures when there is a safe opportunity to do so.
-
-
-
-qalias=noansi,
-qalias=nostd
- -qalias=(noansi|nostd)\b
-
-qalias=ansi | noansi
If ansi is specified, type-based aliasing is
used during optimization, which restricts the
lvalues that can be safely used to access a
data object. The default is ansi for the xlc,
xlC, and c89 commands. This option has no
effect unless you also specify the -O option.
-qalias=std | nostd
Indicates whether the compilation units contain
any non-standard aliasing (see the Compiler Reference
for more information). If so, specify nostd.
-
-
-qalign=natural
- -qalign=(\S+)\b
-
Specifies what aggregate alignment rules the
compiler uses for file compilation, where the
alignment options are:
bit_packed
The compiler uses the bit_packed alignment
rules.
full
The compiler uses the RISC System/6000
alignment rules. This is the same as power.
mac68k
The compiler uses the Macintosh alignment
rules. This suboption is valid only for 32-
bit compilations.
natural
The compiler maps structure members to their
natural boundaries.
packed
The compiler uses the packed alignment rules.
power
The compiler uses the RISC System/6000
alignment rules.
twobyte
The compiler uses the Macintosh alignment
rules. This suboption is valid only for 32-
bit compilations. The mac68k option is the
same as twobyte.
The default is -qalign=full.
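The difference between mapping members to their natural boundaries and using a packed layout can be illustrated with Python's struct module. This sketch reports the layout rules of the host on which Python runs, not those of the XL compilers, and the two-member record (a char followed by a double) is an assumption chosen for the example.

```python
import struct

# "c" is a 1-byte char and "d" is an 8-byte double (hypothetical record layout).
FIELDS = "cd"

# "@" requests the host's native alignment, so padding may follow the char.
native_size = struct.calcsize("@" + FIELDS)

# "=" uses standard sizes with no padding, similar in spirit to a packed layout.
packed_size = struct.calcsize("=" + FIELDS)

if __name__ == "__main__":
    print("native alignment:", native_size, "bytes")
    print("no padding:      ", packed_size, "bytes")
    print("padding inserted:", native_size - packed_size, "bytes")
```

The amount of padding depends on the host platform. The packed size is always the plain sum of the member sizes.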
-
-qstrict,
-qnostrict
- -q(no)?strict\b
-
Ensures that optimizations done by default at
optimization levels -O3 and higher, and, optionally
at -O2, do not alter the semantics of a program.
The -qstrict=all, -qstrict=precision,
-qstrict=exceptions, -qstrict=ieeefp, and
-qstrict=order suboptions and their negative forms
are group suboptions that affect multiple,
individual suboptions. Group suboptions act as if
either the positive or the no form of every
suboption of the group is specified.
Default:
o Always -qstrict or -qstrict=all when the
-qnoopt or -O0 optimization level is in effect
o -qstrict or -qstrict=all is the default when
the -O2 or -O optimization level is in effect
o -qnostrict or -qstrict=none is the default
when -O3 or a higher optimization level is in
effect
The suboption list is a colon-separated list of one
or more of the following:
all | none
all disables all semantics-changing
transformations, including those controlled by
the ieeefp, order, library, precision, and
exceptions suboptions. none enables these
transformations.
precision | noprecision
precision disables all transformations that
are likely to affect floating-point precision,
including those controlled by the subnormals,
operationprecision, association,
reductionorder, and library suboptions.
noprecision enables these transformations.
exceptions | noexceptions
exceptions disables all transformations likely
to affect exceptions or be affected by them,
including those controlled by the nans,
infinities, subnormals, guards, and library
suboptions. noexceptions enables these
transformations.
ieeefp | noieeefp
ieeefp disables transformations that affect
IEEE floating-point compliance, including
those controlled by the nans, infinities,
subnormals, zerosigns, and operationprecision
suboptions. noieeefp enables these
transformations.
nans | nonans
nans disables transformations that may produce
incorrect results in the presence of, or that
may incorrectly produce IEEE floating-point
signaling NaN (not-a-number) values. nonans
enables these transformations.
infinities | noinfinities
infinities disables transformations that may
produce incorrect results in the presence of,
or that may incorrectly produce floating-point
infinities. noinfinities enables these
transformations.
subnormals | nosubnormals
subnormals disables transformations that may
produce incorrect results in the presence of,
or that may incorrectly produce IEEE
floating-point subnormals (formerly known as
denorms). nosubnormals enables these
transformations.
zerosigns | nozerosigns
zerosigns disables transformations that may
affect or be affected by whether the sign of a
floating-point zero is correct. nozerosigns
enables these transformations.
operationprecision | nooperationprecision
operationprecision disables transformations
that produce approximate results for
individual floating-point operations.
nooperationprecision enables these
transformations.
order | noorder
order disables all code reordering between
multiple operations that may affect results or
exceptions, including those controlled by the
association, reductionorder, and guards
suboptions. noorder enables code reordering.
association | noassociation
association disables reordering operations
within an expression. noassociation enables
reordering operations.
reductionorder | noreductionorder
reductionorder disables parallelizing
floating-point reductions. noreductionorder
enables these reductions.
guards | noguards
guards disables moving operations past guards
or calls which control whether the operation
should be executed or not. noguards enables these
moving operations.
library | nolibrary
library disables transformations that affect
floating-point library functions. nolibrary
enables these transformations.
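Two of the suboptions above, association and zerosigns, guard against effects that are easy to reproduce in any IEEE double-precision environment. The Python sketch below does not invoke the compiler. It only shows, under that assumption, why reordering operations within an expression or simplifying x + 0.0 to x can change a result.

```python
import math

# association: floating-point addition is not associative.
a, b, c = 0.1, 0.2, 0.3
as_written = (a + b) + c      # evaluation order as written in the source
reassociated = a + (b + c)    # order a reassociating optimizer might choose

# zerosigns: replacing x + 0.0 with x changes the sign of the result for x = -0.0.
x = -0.0
unsimplified = x + 0.0        # IEEE round-to-nearest yields +0.0
simplified = x                # the "optimized" form keeps -0.0

if __name__ == "__main__":
    print(as_written, reassociated, as_written == reassociated)
    print(math.copysign(1.0, unsimplified), math.copysign(1.0, simplified))
```

The two sums differ only in the last bit, which is the kind of difference that -qstrict is meant to rule out.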
-
-
-
-qipa=threads
- -qipa=threads(=\d+)?\b
-
The threads suboption allows the IPA optimizer to run portions
of the optimization process in parallel threads, which can speed up the
compilation process on multi-processor systems. All the available
threads, or the number specified by N, may be used. N must be a positive
integer. Specifying nothreads does not run any parallel threads;
this is equivalent to running one serial thread.
This option does not affect the code in the final binary created.
-
- fdpr -q -O4 -A 32 -bldcg -shci 90 -sdp 9
The fdpr command (Feedback Directed Program Restructuring) is a performance-tuning utility that may help
improve the execution time and the real memory utilization of user-level application programs. The fdpr program
optimizes the executable image of a program by collecting information on the behavior of the program while the
program is used for some typical workload, and then creating a new version of the program that is optimized for
that workload. The new program generated by fdpr typically runs faster and uses less real memory.
Usage:
fdpr [options] -p program [-x invocation]
where -p specifies the input program, in the form of an executable, shared object,
or archive file
-x specifies how to invoke the program
[options] can be one or more of the following:
Action Options:
-123 Specifies which actions/phases to run, where:
-1 generates instrumented program for profile gathering
-2 runs the instrumented program and updates profile data (requires -x)
-3 generates optimized program
Default is set to run all three phases (-123)
-a/--action [action] Specifies customized actions
where [action] can be one of the following:
anl analyze program
instr generate instrumented program for profile gathering (same as -1)
opt generate optimized program (same as -3)
check_sign check fdpr signature in the input program
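The three phases can also be requested one at a time. The Python sketch below only assembles the corresponding argument lists and does not run fdpr. The program name my_app and its invocation are hypothetical, and placing the invocation words after -x as the final arguments is an assumption based on the usage line above.

```python
import shlex


def fdpr_phase_commands(program: str, invocation: str) -> list[list[str]]:
    """Return one fdpr argument list per phase (-1, -2, -3).

    Hypothetical helper: it assumes the words of the invocation follow -x
    as the last arguments of the phase 2 command.
    """
    return [
        ["fdpr", "-1", "-p", program],                                   # instrument
        ["fdpr", "-2", "-p", program, "-x", *shlex.split(invocation)],   # profile
        ["fdpr", "-3", "-p", program],                                   # optimize
    ]


if __name__ == "__main__":
    for argv in fdpr_phase_commands("my_app", "./my_app input.dat"):
        print(" ".join(shlex.quote(word) for word in argv))
```

Running all three phases in one command corresponds to the default -123 described above.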
Analysis Options:
-aawc/-noaawc, --analyze-assembly-written-csects/--noanalyze-assembly-written-csects
Analyze/Do not analyze objects written in Assembly. The
default is set to analyze modules written in Assembly
-acf , --analysis-configuration-file
Provide a configuration file of analysis information
(advanced option)
-asd, --analyze-static-data
Analyze static data objects as distinct data elements
for data reordering (unsafe for certain compilers)
-esa, --extra-safe-analysis
Limit analysis phase to compiler generated code
-fca, --funcsect-analysis
Apply special analysis for an input executable that was
compiled with the -qfuncsect compiler option
-ff , --file-format
Input file format: can be LM (load module) or PO
(program object)
-ifl , --ignored-function-list
Set the ignored function list. The file contains names
of functions that should not be instrumented or
optimized
-iinf, --ignore-info Ignore .info sections produced with the -qfdpr option
during compile time
Instrumentation Options:
-anl, --analyze-program
Analyze the program but do not create any modified
binary. This option is used to provide a dump of
profile/code coverage information. When used with the
-d option, it dumps the disassembly of the original
program
-ccf , --code-coverage-file
Use file mapped to shared memory to collect coverage
information at run-time
-ccgi , --code-coverage-generate-info
Produce coverage information to given file based on
profile information. Use =XML for XML output and
=FLAT for flat formatted text file. The generated
file is