===============================================
HP-UX Flag Descriptions for CPU2000 - July 2003
===============================================

-----------------------------------------------------------------
Common Flags for HP-UX F90 Compiler, C Compiler and aCC Compiler
Compiler specific flags are mentioned below or in other notes
-----------------------------------------------------------------

+Olevel
        Invoke optimizations selected by level. Defined values
        for level are:

        0   Perform minimal optimizations.
        1   Perform optimizations within basic blocks only.
            This is the default.
        2   Perform level 1 and global optimizations.
            Same as -O.
        3   Perform level 2 as well as interprocedural global
            optimizations within translation units.
        4   Perform level 3 as well as interprocedural
            optimizations across translation units (link time
            optimizations). Requires concurrent use of the
            +Oprofile=use option.

            NOTE: Object files generated at this level contain
            an intermediate representation of the user code and
            are intended to be temporary files. These
            intermediate object files are not guaranteed to be
            compatible from one version of the compiler to the
            next.

+O[no]datalayout
        Enables [disables] profile-driven layout of global and
        static data items to improve cache memory utilization.
        This option is currently ignored in the absence of the
        dynamic profile feedback option, +Oprofile=use.
        The default is +Onodatalayout.

+O[no]dataprefetch
        Enable [disable] optimizations to generate data prefetch
        instructions for data structures referenced within
        innermost loops.
        +Odataprefetch is the same as +Odataprefetch=indirect.
        +Onodataprefetch is the same as +Odataprefetch=none.

+Oentrysched
        Perform instruction scheduling on a subprogram's entry
        and exit code sequences. This option can be used at
        optimization level 1 and higher.
        The default is +Onoentrysched.

+Ofast
        Select a combination of compilation options for optimum
        execution speed for reasonable build times.
        Currently: +O2, +Onolimit, +Olibcalls, +Ofltacc=relaxed,
        -Wl,+mergeseg. This option is a synonym for -fast.
        Some of the linker settings above can be changed with
        chatr(1).

+Ofaster
        This option selects +Ofast, but with an increased
        optimization level. If used with +Oprofile=use the
        optimization level will be +O4. Otherwise it will be +O3.

+Ofltacc=level
        Controls the level of floating point optimizations that
        the compiler may perform. The defined values for level
        are:

        default  Allows contractions, such as fused multiply-add
                 (FMA), but disallows any other floating point
                 optimization that can result in numerical
                 differences.
        limited  Like default, but also allows floating point
                 optimizations which may affect the generation
                 and propagation of infinities, NaNs, and the
                 sign of zero.
        relaxed  In addition to the optimizations allowed by
                 limited, permits optimizations, such as
                 reordering of expressions, even if
                 parenthesized, that may affect rounding error.
                 This is the same as +Onofltacc.
        strict   Disallows any floating point optimization that
                 can result in numerical differences. This is
                 the same as +Ofltacc.

        The default is +Ofltacc=default.

+Olibcalls
        NOTE: This option is deprecated and may not be supported
        in future releases. On Itanium(R)-based HP-UX, including
        a system header file will cause the functions declared
        therein to be eligible for libcalls transformations,
        regardless of the state of +O[no]libcalls.

+O[no]initcheck
        Enable [disable] initialization to zero of any local,
        scalar, non-static variable that is uninitialized with
        respect to at least one path leading to its use. This
        optimization can occur at optimization levels 2, 3,
        and 4. The default is to enable initialization if the
        variable is uninitialized with respect to every path
        leading to its use.

+O[no]inline
        Request [disable] inlining and cloning. This option can
        be used at optimization level 3 and higher.
        The default is +Oinline.

+O[no]inline=function1[,function2...]
        Enable [disable] optimizer inlining for the named
        functions. This optimization can occur at optimization
        levels 3 and 4. The default is +Oinline.

+Oinlinebudget=n     aCC(1)/cc(1)
+Oinline_budget=n    f90(1)
        Perform more aggressive inlining, where n specifies the
        degree of aggressiveness, as follows:

        100     Default level of inlining.
        > 100   More aggressive inlining at the expense of
                compilation time and code size. The maximum
                for n is 1000000.
        2 - 99  Less aggressive inlining. The optimizer gives
                more weight to compilation time and code size
                when determining whether to inline.
        1       Inline only if it reduces code size.

        This option can be used at optimization level 3 or
        higher.

+O[no]limit
        Suppress [do not suppress] optimizations that
        significantly increase compile time or consume enormous
        amounts of memory.
        +Olimit is the same as +Olimit=min.
        +Onolimit is the same as +Olimit=none.

+Olimit=level
        Controls the amount of compile time spent performing
        optimization. The defined values for level are:

        default  Based on tuning heuristics, the optimizer will
                 spend a reasonable amount of time processing
                 large procedures.
        min      For large procedures, the optimizer will avoid
                 non-linear time optimizations.
        none     The optimizer will fully optimize large
                 procedures, possibly resulting in significantly
                 increased compile time.

+O[no]loop_unroll[=unroll_factor]
        Enable [disable] loop unrolling. This optimization can
        occur at optimization levels 2, 3, and 4. The default is
        +Oloop_unroll. The unroll_factor controls code
        expansion; its default value is 4, that is, four copies
        of the loop body.

+O[no]procelim
        Enable [disable] the elimination of functions that are
        not referenced by the application. Only functions with
        the hidden export class may be eliminated. The default
        is +Onoprocelim at optimization levels 0 and 1; at
        levels 2, 3 and 4, the default is +Oprocelim.

+O[no]ptrs_to_globals[=name1,name2,...,nameN]
        Tell the optimizer whether global variables are modified
        [are not modified] through pointers.
        This optimization can occur at levels 2, 3, and 4.
        The default is +Optrs_to_globals.

+O[no]recovery
        Generate [do not generate] recovery code for control
        speculation. The default is +Onorecovery.

+Oshortdata[=size]
        All objects of size bytes or smaller will be placed in
        the short data area, and references to such data will
        assume it resides in the short data area. Valid values
        of size are 0, or a decimal number between 8 and
        4,194,304 (4MB). If no size is specified, all data is
        placed in the short data area. If size is 0, no data
        will be placed in the short data area, and all data
        references will use long offsets.
        The default is +Oshortdata=8.

+O[no]type_safety=[off|limited|ansi|strong]
        Enable [disable] aliasing across types.

        off      The default. Specifies that aliasing can occur
                 freely across types. This is a synonym for the
                 +Onoptrs_ansi and +Onoptrs_strongly_typed
                 options in cc.
        limited  Code follows ANSI aliasing rules, and unnamed
                 objects should be treated as if they had an
                 unknown type.
        ansi     Code follows ANSI aliasing rules, and unnamed
                 objects should be treated the same as named
                 objects. This option is a synonym for the
                 +Optrs_ansi option in cc.
        strong   Code follows ANSI aliasing rules, except that
                 accesses through lvalues of a character type
                 are not permitted to touch objects of other
                 types. This assumes that field addresses are
                 not taken. This option is a synonym for the
                 +Optrs_strongly_typed option in cc.

-Bprotected[=symbol[,symbol...]]
        The named symbols, or all symbols if no symbols are
        specified, are assigned the protected export class. That
        means these symbols will not be preempted by symbols
        from other load modules, so the compiler may bypass the
        linkage table for both code and data references and bind
        them to locally defined code and data symbols.

-Bprotected_data
        Marks only data symbols as having the protected export
        class.

+DDdata_model
        Generate code using either the ILP32 or LP64 data model.
        Defined values for data_model are:

        32  Use the ILP32 data model.
            The sizes of the int, long and pointer data types
            are 32 bits.
        64  Use the LP64 data model. The size of the int data
            type is 32 bits, and the sizes of the long and
            pointer data types are 64 bits. Defines __LP64__ to
            the preprocessor.

        The default is +DD32.

+DSmodel (e.g. +DSnative)
        Perform instruction scheduling appropriate for a
        specific implementation of the architecture. On IPF the
        defined values for model are:

        blended   Tune for best performance on a combination of
                  processors (i.e., Itanium or Itanium 2
                  processor).
        itanium   Tune for best performance on an Itanium
                  processor.
        itanium2  Tune for best performance on an Itanium 2
                  processor.
        native    Tune for best performance on the processor on
                  which the compiler is running.

+FPflag (e.g. +FPD)
        Specify how the environment for floating-point
        operations should be initialized at program start-up.
        By default, all behaviors are disabled. The following
        flags are supported (upper case flag enables; lower case
        flag disables):

        D (d)  Enable sudden underflow (flush to zero) of
               denormalized values.

-Wl,-asearch (e.g. -Wl,-aarchive_shared)
        (ld option -a search) Specifies library search order.
        Specify whether shared or archive libraries are searched
        with the -l option. The value of search should be one of
        archive, shared, archive_shared, shared_archive, or
        default. This option can appear more than once,
        interspersed among -l options, to control the searching
        for each library. The default is to use the shared
        version of a library if one is available, or the archive
        version if not.

        If either archive or shared is active, only the
        specified library type is accepted.
        If archive_shared is active, the archive form is
        preferred, but the shared form is allowed.
        If shared_archive is active, the shared form is
        preferred, but the archive form is allowed.

-dynamic
        Produces dynamically bound executables. See -minshared
        for partially statically bound executables.
        The default behavior is dynamic.

-exec
        Indicates that any object files created will be used to
        create an executable file. Constants with a protected or
        hidden export class are placed in the read-only data
        section. This option also implies -Bprotected_def.

-minshared
        Indicates that the result of the current compilation is
        going into an executable file that will make minimal use
        of shared libraries. Equivalent to
        -exec -Bprotected -Wl,-a,archive_shared.

[Profile Feedback Related Options]

+Oprofile=collect
+Oprofile=collect[:qualifiers]
+I
        Instrument the application for profile based
        optimization. qualifiers are a comma-separated list of
        profile collection qualifiers. The profile collection
        qualifiers are:

        arc     Collect arc counts (equivalent to
                +Oprofile=collect). This is the default.
        stride  Collect stride data.
        all     Collect all types of profile data. Equivalent to
                the command +Oprofile=collect:arc,stride.

+Oprofile=use
+P
        Optimize the application based on profile data found in
        the database file flow.data, produced by compilation
        with +I. +P is equivalent to +Oprofile=use or
        +Oprofile=use:filename. See ld(1), +I, and +df for more
        details.

        The +P option is incompatible with the +I and -S
        options. It is incompatible with the -g option only
        during compile time.

-----------------------------------------------
Specific Flags for HP-UX F90 Compiler
-----------------------------------------------

+cat
        Concatenates all source files of the same source form
        together, then compiles the concatenated source all at
        once. This enables inlining at +O3 within the
        concatenated file.

-----------------------------------------------
Specific Flags for HP-UX C and aCC Compiler
-----------------------------------------------

-AOe
        In addition to specifying the extended ANSI C language
        dialect as per -Ae (the default), allows the optimizer
        to aggressively exploit the assumption that the source
        code conforms to the ANSI programming language C
        standard ISO 9899:1990 plus the extensions.
        At present, the effect is to make +Otype_safety=ansi the
        default (it can of course be overridden). As new
        independently-controllable optimizations are developed
        that depend on the extended ANSI C standard, the flags
        that enable those optimizations may also become the
        default under -AOe.

-Ae
        Turns on ANSI C c89 mode. This option allows compilation
        of c89 compatible C source programs just like the C
        compiler.

+inline_level [i]num
        This option controls how C/C++ inlining hints influence
        aCC or cc. Specify num as 0, 1, 2, or 3.

        num  Meaning
        0    No inlining is done (same effect as the +d option).
        1    Only small functions are inlined.
        2    Only large functions are not inlined.
        3    Inlining hints are respected in all cases, except
             when the called function is recursive or when it
             has a variable number of arguments.

        The default level depends on +Olevel as shown in the
        following table:

        level  num
        0      1
        1      1
        2      2
        3      2
        4      2

        If i is also specified, then implicit inlining is
        invoked for "small" functions without the inline
        keyword.

        NOTE: This option controls functions declared with the
        inline keyword or within the class declaration and is
        effective at all optimization levels. The options
        +Oinline and +Oinlinebudget control the high level
        optimizer that recognizes other opportunities in the
        same source file (+O3) or amongst all source files
        (+O4).

-----------------------------------------------
Other descriptions
-----------------------------------------------

-llapack
        Link in highly tuned math library functions found in the
        LAPACK library. B6061AA (HP MLIB) is an optional HP
        product which contains the LAPACK library.

effmem.o
        Replacement for malloc/free that assumes ANSI
        compliance, improves spatial locality, and minimizes
        memory usage by not maintaining a free list.

fastmem.o
        Replacement for malloc/free that assumes ANSI
        compliance.

-----------------------------------------------
Descriptions of Portability Flags
-----------------------------------------------

+source={fixed|free|default}
        Accept source files in fixed format (+source=fixed) or
        free format (+source=free). The default,
        +source=default, is free for .f90 files and fixed for
        .f and .F source files.

176.gcc
        -DHOST_WORDS_BIG_ENDIAN : controls how bytes are
                                  numbered within a word.

181.mcf
        -DWANT_STDC_PROTO       : allows use of the designated
                                  prototype.

186.crafty
        -DHP                    : selects header files and code
                                  paths compatible with HP-UX.

252.eon
        -DFMAX_IS_DOUBLE        : function fmax returns a double.
        -DNDEBUG                : do not include debug code.
        -DSPEC_CPU2000_LP64     : use code to make longs and
                                  pointers 64 bit.

253.perlbmk
        -DSPEC_CPU2000_HP       : compile the SPEC CPU2000
                                  modified perl for an HP-UX
                                  system.

254.gap
        -DSYS_HAS_CALLOC_PROTO  : allows use of the designated
                                  prototype.
        -DSYS_HAS_IOCTL_PROTO   : allows use of the designated
                                  prototype.
        -DSYS_HAS_TIME_PROTO    : allows use of the designated
                                  prototype.
        -DSPEC_CPU2000_HP       : selects header files and code
                                  paths compatible with HP-UX.
        -DSYS_IS_USG            : compile for a USGish system.

-----------------------------------------------
Descriptions of Kernel Tunables
-----------------------------------------------

(Unless otherwise noted, units are in bytes)

dbc_max_pct     Maximum dynamic buffer cache size as a percent
                of system memory
dbc_min_pct     Minimum dynamic buffer cache size as a percent
                of system memory
maxdsiz         Maximum data size
maxdsiz_64bit   Maximum data size for 64 bit applications
maxssiz         Maximum stack size
maxssiz_64bit   Maximum stack size for 64 bit applications
maxtsiz         Maximum text segment size
maxtsiz_64bit   Maximum text segment size for 64 bit
                applications
vps_ceiling     Maximum System-Selected Page Size (in Kbytes)
vps_pagesize    Default user page size (in Kbytes)
swapmem_on      Swap to memory flag.