=====================================================
HP-UX Flag Descriptions for CPU2000 - June 26, 2006
=====================================================

This file is organized into five sections as follows:

  1. Compiler and Linker Flags
  2. Other Descriptions (libraries)
  3. Portability Flags
  4. Kernel Tunables
  5. Other Options and Commands (utilities and filesystem options)

-----------------------------------------------------------------
Compiler and Linker Flags
HP ANSI C and aCC versions 6.12, and Fortran90 version 11.23.32,
with August 2006 patches
-----------------------------------------------------------------

-Aa
    For aC++, turns on newly supported ANSI C++ Standard features
    like Koenig lookup and correct scoping for variables declared in
    conditional statements like for-loops. Additional features may
    be enabled by this option in the future.

    For C, compile under strict ANSI mode (ANSI programming language
    C standard ISO 9899:1990). When compiling under ANSI mode, the
    header files define only those names (macros and typedefs)
    specified by the Standard. To access macros and typedefs that
    are not defined by the ANSI Standard but are provided by the
    HP-UX Operating System, define the symbol _HPUX_SOURCE, or use
    the extension option -Ae described below.

-Ae
    In aC++, invokes aCC as an ANSI C compiler, with additional
    support for HP-C language extensions. Refer to the HP-C compiler
    documentation for details of the HP-C language extensions.

    In C, this is the default, Extended ANSI mode. Same as
    -Aa -D_HPUX_SOURCE +e. This defines the names (macros and
    typedefs) provided by the HP-UX Operating System and, in
    addition, allows extensions such as C99 features in this release
    (complex and imaginary data types, STDC pragmas), $ characters
    in identifier names, sized enums, sized bit-fields, and 64-bit
    integral types. Additional extensions may be added to this
    option in the future.

-AOa,-AOe
    See corresponding -Aa or -Ae option.
    In addition, allows the optimizer to aggressively exploit the
    assumption that the source code conforms to the ANSI programming
    language C or C++ standard. At present, the effect is to make
    +Otype_safety=ansi the default. As new independently
    controllable optimizations are developed that depend on the ANSI
    standards, the flags that enable those optimizations may also
    become the default under -AOa/-AOe.

-Bprotected[=symbol[,symbol...]]
    The named symbols, or all symbols if no symbols are specified,
    are assigned the protected export class. That means these
    symbols will not be preempted by symbols from other load
    modules, so the compiler may bypass the linkage table for both
    code and data references and bind them to locally defined code
    and data symbols. When used with no symbol list, -Bprotected
    implies -Wl,-aarchive_shared, causing the linker to prefer an
    archive library over a shared library if one is available. This
    can be overridden by following the -Bprotected option with a
    subsequent -Wl,-a option.

-Bprotected_data
    Marks only data symbols as having the protected export class.

-Bprotected_def
    Same as -Bprotected but only locally defined (non-tentative)
    symbols are assigned the protected export class.

+cat (f90 only)
    Concatenates all source files of the same source form together,
    then compiles the concatenated source all at once. This enables
    inlining at +O3 within the concatenated file.

+DD32
    Use the ILP32 data model. The sizes of the int, long and pointer
    data types are 32 bits. Defines _ILP32 to the preprocessor. This
    is the default.

+DD64
    Use the LP64 data model. The size of the int data type is
    32 bits, and the sizes of the long and pointer data types are
    64 bits. Defines __LP64__ and _LP64 to the preprocessor.

+DSmodel
    Perform instruction scheduling appropriate for a specific
    implementation of the architecture. On IPF the defined values
    for model are:

    +DSblended
        Tune for best performance on a combination of processors (i.e.
        Itanium, Itanium2, or native processor).
    +DSitanium
        Tune for best performance on an Intel(R) Itanium(R)
        processor.
    +DSitanium2
        Tune for best performance on an Intel(R) Itanium(R) 2
        processor.
    +DSnative
        Tune for best performance on the processor on which the
        compiler is running.

    The default model is +DSblended.

-dynamic
    Produces dynamically bound executables. See -minshared for
    partially statically bound executables. The default behavior is
    dynamic.

-exec
    Indicates that any object files created will be used to create
    an executable file. Constants with a protected or hidden export
    class are placed in the read-only data section. This option also
    implies -Bprotected_def.

-fast
    This is a synonym for +Ofast.

+FP flag
    Specify how the environment for floating-point operations should
    be initialized at program start-up. By default, all behaviors
    are disabled. The following flags are supported (upper case flag
    enables; lower case flag disables):

    +FPD (d)  Enable sudden underflow (flush to zero) of
              denormalized values.
    +FPI (i)  Trap on floating-point operations that produce inexact
              results.
    +FPN (n)  Trap on Denormal|Unnormal operand floating-point
              operation.
    +FPO (o)  Trap on floating-point overflow.
    +FPU (u)  Trap on floating-point underflow.
    +FPV (v)  Trap on invalid floating-point operations.
    +FPZ (z)  Trap on divide by zero.

+inline_level [num]
    This option controls how C/C++ inlining hints influence aCC or
    cc. Specify num as an integral value between 0 and 9.

    num   Meaning
    0     No inlining is done (same effect as the +d option).
    1     Only functions marked with the inline keyword or implied
          by the language to be inline are considered for inlining.
    2     More inlining than level 1. This is the default level at
          optimization levels +O2, +O3, and +O4.
    3-8   Increasing levels of inliner aggressiveness.
    9     Attempt to inline all functions other than recursive
          functions or those with a variable number of arguments.
    The default level depends on +Olevel as shown in the following
    table:

    level   num
    0       1
    1       1
    2       2
    3       2
    4       2

    NOTE: The options +Oinline and +Oinline_budget also influence
    inlining aggressiveness.

-ipo
    Enable interprocedural optimizations across files. This option
    is ignored at optimization levels +O0 and +O1. It is enabled by
    default when +O4 or +Ofaster are used.

-minshared
    Indicates that the result of the current compilation is going
    into an executable file that will make minimal use of shared
    libraries. Equivalent to -exec -Bprotected. This option is only
    supported on HP-UX version 11i and later.

+noeh (C++ only)
    Disable exception handling. Note that mixing objects compiled
    both with and without +noeh can have undesired results. Object
    destruction, for example, will not be done for objects local to
    functions compiled with the +noeh option.

+Olevel
    Invoke optimizations selected by level.

    For C and C++, defined values are:

    +O0
        Perform minimal optimizations.
    +O1
        Perform optimizations within basic blocks only. This is the
        default.
    +O2
        Perform level 1 and global optimizations. Enable automatic
        inlining. Same as -O.
    +O3
        Perform level 2 as well as interprocedural global
        optimizations within translation units.
    +O4
        Perform level 3 as well as doing interprocedural
        optimizations across translation units (link time
        optimizations). See -ipo.

        NOTE: Object files produced using this option contain
        intermediate code in the IELF format. At link time, ld
        automatically invokes the interprocedural optimizer u2comp
        if any of the input object files is an IELF file. These IELF
        object files are intended to be temporary files. They are
        not guaranteed to be compatible from one version of the
        compiler to the next.

    For Fortran90, defined values for level are:

    +O0
        Minimal optimization, fastest compile time, best debugging
        support. This is the default.
    +O1
        Block-level optimizations, moderately fast compile time,
        moderate improvement in runtime performance.
    +O2
        Full optimization within each subprogram in a file. Marked
        improvement in runtime performance, noticeably longer
        compile time, program transformations more pronounced than
        at lower levels.
    +O3
        Full optimization across all subprograms within the source
        file, including subprogram cloning and inlining. This level
        of optimization can greatly improve the runtime performance
        of programs that make frequent procedure calls.
    +O4
        Perform level 3 as well as doing interprocedural
        optimizations across translation units (link time
        optimizations). Object files generated at this level contain
        an intermediate representation of the user code and are
        intended to be temporary files. These intermediate object
        files are not guaranteed to be compatible from one version
        of the compiler to the next. Requires concurrent use of the
        +Oprofile=use option. This option is only supported on
        Itanium(R) processor family architecture.

+Odatalayout [+Onodatalayout]
    Enables [disables] profile-driven layout of global and static
    data items to improve cache memory utilization. This option is
    currently enabled if +Oprofile=use (dynamic profile feedback) is
    specified. The default in the absence of +Oprofile=use is
    +Onodatalayout.

+Odataprefetch [+Onodataprefetch]
    Enable [disable] optimizations to generate data prefetch
    instructions for data structures referenced within innermost
    loops. +Odataprefetch is the same as +Odataprefetch=indirect.
    +Onodataprefetch is the same as +Odataprefetch=none.

+Odataprefetch=kind
    Control generation of data prefetch instructions for data
    structures referenced within innermost loops. The defined values
    for kind are:

    direct
        Enable generation of data prefetch instructions for the
        benefit of direct memory accesses, but not indirect memory
        accesses.
    indirect
        Enable generation of data prefetch instructions for the
        benefit of both direct and indirect memory accesses. This is
        the default at optimization levels +O2 and above.
    none
        Disable generation of data prefetch instructions.
        This is the default at optimization levels +O1 and below.

+Oentrysched (f90 only)
    Perform instruction scheduling on a subprogram's entry and exit
    code sequences. This option can be used at optimization level 1
    and higher. The default is +Onoentrysched.

+Ofast,-fast
    This option selects a combination of compilation and link
    options for optimum execution speed and reasonable build times.

    Currently (C, C++): +O2, +Onolimit, +Ofltacc=relaxed, +DSnative,
    +FPD, -Wl,+pi,1M, -Wl,+pd,1M and -Wl,+mergeseg. Some of the
    linker settings above can be changed with chatr(1).

    (Fortran90): +O2, +Onolimit, +Ofltacc=relaxed, +DSnative, +FPD,
    -Wl,+pi,1M, -Wl,+pd,1M, -Wl,+mergeseg

+Ofaster
    This option selects +Ofast, but with the optimization level
    increased to +O4.

+Ofltacc=level, +Ofltacc, +Onofltacc
    Controls the level of floating point optimizations that the
    compiler may perform. The defined values for level are:

    +Ofltacc=default
        Allows contractions, such as fused multiply-add (FMA), but
        disallows any other floating point optimization that can
        result in numerical differences.
    +Ofltacc=limited
        Like default, but also allows floating point optimizations
        which may affect the generation and propagation of
        infinities, NaNs, and the sign of zero.
    +Ofltacc=relaxed, +Onofltacc
        In addition to the optimizations allowed by limited, permits
        optimizations, such as reordering of expressions, even if
        parenthesized, that may affect rounding error.
    +Ofltacc=strict, +Ofltacc
        Disallows any floating point optimization that can result in
        numerical differences.

    The default is +Ofltacc=default.

+Oinitcheck [+Onoinitcheck]
    Enable [disable] initialization to zero of any local, scalar,
    non-static variable that is uninitialized with respect to at
    least one path leading to its use. This optimization can occur
    at optimization levels 2, 3, and 4. The default is to enable
    initialization if the variable is uninitialized with respect to
    every path leading to its use.
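
    The +Oinitcheck wording above ("uninitialized with respect to at
    least one path leading to its use") may be easier to picture
    with a small C fragment. The sketch below is illustrative only:
    the function name pick_scale and its mode values are
    hypothetical and do not come from any benchmark source or HP
    header. It shows a local scalar that is assigned on two paths
    and left unassigned on a third.

```c
/* Illustrative sketch only: pick_scale and its mode values are
 * hypothetical names chosen for this example. */

/* Returns x multiplied by a factor selected by mode. */
int pick_scale(int mode, int x)
{
    int factor;   /* local, scalar, non-static, no initializer */

    if (mode == 1)
        factor = 2;
    else if (mode == 2)
        factor = 3;

    /* For any other mode, control reaches the use below without an
     * assignment, so factor is uninitialized with respect to that
     * one path, and standard C leaves the read undefined there.
     * Per the description above, +Oinitcheck would zero-initialize
     * factor, while the default only does so for variables that are
     * uninitialized on every path leading to their use. */
    return x * factor;
}
```

    Calls such as pick_scale(1, 5) and pick_scale(2, 5) stay on the
    assigned paths and do not depend on the flag. Only a call with
    some other mode value would be affected by the choice between
    +Oinitcheck and the default.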

+Oinline [+Onoinline]
    Enable [disable] optimizer inlining for all functions in the
    compilation unit. This optimization can occur at optimization
    levels 3 and 4. The default is +Oinline.

+Oinline_budget=n
    The +Oinline_budget option controls the compile time budget for
    the inliner. A lower number causes the inliner to consider fewer
    candidates for inlining, while a higher number leads it to
    consider more candidates. The inlining candidates are ordered in
    priority order based on the inliner's heuristics, so this does
    not affect the most important candidates. n is an integer in the
    range 1 - 1000000 that specifies the level of aggressiveness as
    follows:

    n         Meaning
    = 100     Default compile time budget.
    > 100     Allow the inliner to consider more candidates and
              increase compile time.
    1 - 99    Restrict the inliner to consider fewer candidates to
              reduce compile time.

    This optimization can occur at optimization levels 3 and 4. The
    default is +Oinline_budget=100.

+Ointeger_overflow=type
    Controls the integer overflow assumptions made by the compiler
    to provide the best runtime performance. The defined values for
    type are:

    +Ointeger_overflow=aggressive
        Allows the compiler to make a broad set of assumptions that
        integer arithmetic expressions do not overflow.
    +Ointeger_overflow=moderate
        This is the same as aggressive except that the LFTR (linear
        function test replacement) optimization is not performed.
        This is the default at optimization levels 2, 3 and 4.
    +Ointeger_overflow=conservative
        Directs the compiler to make fewer assumptions that integer
        arithmetic expressions do not overflow.

+Olibcalls,+Onolibcalls
    (C, C++): This option is deprecated and may not be supported in
    future releases. On Itanium(R)-based HP-UX, including a system
    header file will cause the functions declared therein to be
    eligible for libcalls transformations, regardless of the state
    of +O[no]libcalls.

    (Fortran90): Use [do not use] low-call-overhead versions of
    select library routines.
    This option can be used at any level. At optimization level 0 or
    1, the default is +Onolibcalls; at optimization level 2 or
    higher, the default is +Olibcalls.

+Olit=level
    Controls which data items are placed in the read-only data
    section. The defined values for level are:

    all
        All string literals and all const-qualified variables that
        do not require load-time or run-time initialization will be
        placed in the read-only data section. +Olit=all replaces the
        deprecated +ESlit option.
    const
        All string literals appearing in a context where
        const char * is legal, and all const-qualified variables
        that do not require load-time or run-time initialization
        will be placed in the read-only data section.
    none
        No constants are placed in the read-only data section.
        +Olit=none replaces the deprecated +ESnolit option.

    The default is +Olit=all for C++ and +Olit=const for C.

+Olimit [+Onolimit]
    Suppress [do not suppress] optimizations that significantly
    increase compile-time or consume enormous amounts of memory.

+Olimit=level (C, C++)
    Controls the amount of compile-time spent performing
    optimization. The defined values for level are:

    default
        Based on tuning heuristics, the optimizer will spend a
        reasonable amount of time processing large procedures.
    min
        For large procedures, the optimizer will avoid non-linear
        time optimizations.
    none
        The optimizer will fully optimize large procedures, possibly
        resulting in significantly increased compile time.

+Oloop_unroll[=unroll_factor], [+Onoloop_unroll]
    Enable [disable] loop unrolling. This optimization can occur at
    optimization levels 2, 3, and 4. The default is +Oloop_unroll,
    where the optimizer uses its own heuristics to select an unroll
    factor for each loop. The unroll_factor controls code expansion.

+Oprefetch_latency=cycles
    +Oprefetch_latency applies to loops for which the compiler
    generates data prefetch instructions. cycles must be in the
    range of 0 to 10000.
    A value of zero instructs the compiler to use the default value,
    which is 480 cycles for loops containing floating-point accesses
    and 150 cycles for loops that do not contain any floating-point
    accesses. See the HP aC++ Online Programmer's Guide or HP C
    Online Help.

+Oprocelim [+Onoprocelim]
    Enable [disable] the elimination of functions that are not
    referenced by the application. Only functions with the hidden
    export class may be eliminated. The default is +Onoprocelim at
    optimization levels 0 and 1; at levels 2, 3 and 4, the default
    is +Oprocelim.

+Oprofile=collect
+Oprofile=collect[:]
    Instrument the application for profile based optimization. The
    profile collection types are:

    arc       Enable collection of arc counts.
    dcache    Enable collection of data cache misses.
    stride    Enable collection of stride data.
    loopiter  Enable collection of loop iteration counts.
    all       Enable collection of all types of profile data.
              Equivalent to the command
              +Oprofile=collect:arc,dcache,stride,loopiter.

    For C and C++ the default is all; for Fortran90 the default is
    arc.

+Oprofile=use
    Optimize the application based on profile data found in the
    database file flow.data, produced by compilation with
    +Oprofile=collect.

+Opromote_indirect_calls [+Onopromote_indirect_calls]
    Enable [disable] the promotion of indirect calls to direct
    calls. (Indirect calls occur with pointers to functions.) This
    option can be used at optimization levels 3 and 4. The default
    is +Onopromote_indirect_calls.

+Optrs_to_globals[=name1,name2,...,nameN]
+Onoptrs_to_globals[=name1,name2,...,nameN]
    Tell the optimizer whether global variables are modified [are
    not modified] through pointers. This optimization can occur at
    levels 2, 3 and 4. The default is +Optrs_to_globals.

+Orecovery [+Onorecovery]
    Generate [do not generate] recovery code for control
    speculation. The default is +Orecovery.
    NOTE: For code which writes to uncacheable memory which may not
    be properly identified as volatile, the +Orecovery option
    reduces the risk of incorrect behavior.

+Otype_safety=[off|limited|ansi|strong]
+Onotypesafety
    Enable [disable] aliasing across types.

    off
        The default. Specifies that aliasing can occur freely across
        types. This is a synonym for the +Onoptrs_ansi and
        +Onoptrs_strongly_typed options in cc.
    limited
        Code follows ANSI aliasing rules. Unnamed objects should be
        treated as if they had an unknown type.
    ansi
        Code follows ANSI aliasing rules, and unnamed objects should
        be treated the same as named objects. This is a synonym for
        the +Optrs_ansi option in cc.
    strong
        Code follows ANSI aliasing rules, except that accesses
        through lvalues of a character type are not permitted to
        touch objects of other types. This assumes that field
        addresses are not taken. This is a synonym for the
        +Optrs_strongly_typed option in cc.

-Wl,-a,search (linker option)
-Wl,-aarchive_shared
-Wl,-a,archive_shared
    Specify whether shared or archive libraries are searched with
    the -l option. The value of search should be one of archive,
    shared, archive_shared, shared_archive, or default. This option
    can appear more than once, interspersed among -l options, to
    control the searching for each library. The default is to use
    the shared version of a library if one is available, or the
    archive version if not. If either archive or shared is active,
    only the specified library type is accepted. If archive_shared
    is active, the archive form is preferred, but the shared form is
    allowed. If shared_archive is active, the shared form is
    preferred but the archive form is allowed.

-Wl,+mergeseg (linker option)
    Sets a flag in the executable which causes the dynamic loader to
    merge all data segments of shared libraries loaded at startup
    time into one block. Data segments for each dynamically loaded
    library will also be merged with the data segments of dependent
    libraries.
    This increases run-time performance by allowing the kernel to
    use larger size page table entries.

-Wl,+pd,size (linker option)
    Request a particular virtual memory page size that should be
    used for data. Sizes of 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M,
    256M, D, and L are supported. A size of D allows the kernel to
    choose what page size should be used. A size of L results in
    using the largest page size available. The actual page size may
    vary if the requested size cannot be fulfilled.

-Wl,+pi,size (linker option)
    Request a particular virtual memory page size that should be
    used for instructions. See the +pd option for additional
    information.

-----------------------------------------------
Other descriptions
-----------------------------------------------

-llapack
    Link in highly tuned math library functions found in the LAPACK
    library. B6061AA (HP MLIB) is an HP product which contains the
    LAPACK library, and is included in the HP-UX Technical Computing
    Operating Environment (TCOE).

effmem.o
    Replacement for malloc/free that assumes ANSI compliance and
    improves spatial locality and minimizes memory usage by not
    maintaining a free list.

fastmem.o
    Replacement for malloc/free that assumes ANSI compliance.

-----------------------------------------------
Descriptions of Portability Flags
-----------------------------------------------

+source={fixed|free|default}
    Accept source files in fixed format (+source=fixed) or free
    format (+source=free). The default, +source=default, is free for
    .f90 files and fixed for .f and .F source files.

176.gcc
    -DHOST_WORDS_BIG_ENDIAN : controls how bytes are numbered within
                              a word.

181.mcf
    -DWANT_STDC_PROTO : allows use of the designated prototype.

186.crafty
    -DHP : selects header files and code paths compatible with HPUX.

252.eon
    -DFMAX_IS_DOUBLE     : function fmax returns a double
    -DNDEBUG             : do not include debug code
    -DSPEC_CPU2000_LP64  : use code to make longs and pointers 64 bit

253.perlbmk
    -DSPEC_CPU2000_HP : Compile the SPEC CPU2000 modified perl for
                        an HPUX system.

254.gap
    -DSYS_HAS_CALLOC_PROTO : allows use of the designated prototype
    -DSYS_HAS_IOCTL_PROTO  : allows use of the designated prototype
    -DSYS_HAS_TIME_PROTO   : allows use of the designated prototype
    -DSPEC_CPU2000_HP      : selects header files and code paths
                             compatible with HPUX.
    -DSYS_IS_USG           : Compile for a USGish system.

-----------------------------------------------
Descriptions of Kernel Tunables
-----------------------------------------------

(Unless otherwise noted, units are in bytes)

dbc_max_pct     Maximum dynamic buffer cache size as a percent of
                system memory
dbc_min_pct     Minimum dynamic buffer cache size as a percent of
                system memory
maxdsiz         Maximum data size
maxdsiz_64bit   Maximum data size for 64 bit applications
maxssiz         Maximum stack size
maxssiz_64bit   Maximum stack size for 64 bit applications
maxtsiz         Maximum text segment size
maxtsiz_64bit   Maximum text segment size for 64 bit applications
vps_ceiling     Maximum System-Selected Page Size (in Kbytes)
vps_pagesize    Default user page size (in Kbytes)
swapmem_on      Swap to memory flag.

-----------------------------------------------
Descriptions of Other Options and Commands
-----------------------------------------------

mpsched (HP-UX utility)
    Control the processor or locality domain on which a specific
    process executes.

tmplog (mount_vxfs option)
    In tmplog mode, the intent log is almost always delayed. This
    improves performance, but recent changes may disappear if the
    system crashes. This mode is only recommended for temporary file
    systems.

cpuconfig on|off (EFI firmware command)
    Enables or disables the specified processor.

parmodify (HP-UX partition manager utility)
    The parmodify command is used to modify the attributes of an
    existing partition.
    By default the target partition is the local partition. Either
    the -u or the -g option can be specified to allow this command
    to modify any other partition in the (local or remote) complex.
    This command can be used to set the amount of memory allocated
    to local memory for each cell in the system as follows:

        parmodify -p -m ::y::