The following compiler and system parameter ("flag") description covers the
Fujitsu Siemens PrimePower CPU2000 results published from Sep-2001 on, until
overridden by a more recent flag description. It is identical to the Fujitsu
Ltd. flag description contained in
http://www.spec.org/osg/cpu2000/flags/FUJITSU-PRIMEPOWER-20010814.txt
==================================================

--------------------------------------------------------------------------------
Fujitsu Parallelnavi 1.0.2 compiler flag description

Compiler options        Remark
--------------------------------------------------------------------------------
-Am                     Required if a source file contains modules which will
                        be referenced by USE statements in other source files,
                        or if a source file contains USE statements that
                        reference modules in another source file.
-dy/-dn                 Specifies dynamic (-dy) or static (-dn) linkage of
                        libraries. -dy is the default unless -Kfast_GP=n (n>=3)
                        is specified and -Klargepage is not specified; in that
                        case -dn is the default.
-Fixed                  Specifies that Fortran source programs are written in
                        fixed source form.
-f omitmsg              Sets the level of diagnostic messages output and
                        inhibits specific messages. omitmsg is one of i, w,
                        or s, and/or a list of msgnum.
                        i      : All messages are output; this is the default.
                        w      : i level messages are not output.
                        s      : i and w level messages are not output.
                        msgnum : Message number msgnum is inhibited. msgnum
                                 must be an i or w level message.
-Kbcopy                 Converts memory copy loops into calls to the memmove
                        or memcpy function.
-Kcfunc                 Uses the high speed mathematical functions and library
                        functions (malloc, calloc, realloc, free) provided by
                        this compilation system.
-Kcommonpad[=N]         Inserts padding elements in common blocks for
                        efficient use of the cache. N can be specified from
                        4 to 4096.
-Kcond                  Generates conditional-move instructions. This option
                        is effective only if the -Kcrossfile option is also
                        specified.
-Kcrossfile             This option specifies crossfile optimization.
                        If the program consists of several files, the compiler
                        reads these files together and analyzes data
                        dependencies and control relations across them.
-Kfrecipro              Optimizes floating point division by making use of
                        the reciprocal.
-Kfuse                  Fuses neighboring loops.
-Kgs                    Performs global instruction scheduling.
-Kpreex                 Optimizes by moving the evaluation of invariant
                        expressions across branches.
-Kilfunc                Replaces several single and double precision
                        mathematical functions (sin, cos, log10, log and exp)
                        with compiler built-in functions.
-KGREG                  The global registers g2 through g7 are subject to
                        register allocation in the compile stage.
-Kfuncalign=n           Aligns function entry points on n-byte boundaries.
-Kfast_GP[={0|1|2|3|4|5}]
                        Performs optimization for the SPARC64 GP series.
                        0: Performs optimization suitable for SPARC64 GP.
                        1: Generates multiply and add instructions in addition
                           to -Kfast_GP=0. (default)
                        2: Performs reordering of expression evaluation in
                           addition to -Kfast_GP=1.
                        3: Performs crossfile optimization and inter-
                           procedural optimization in addition to
                           -Kfast_GP=2.
                        4: Performs advanced branch optimization in addition
                           to -Kfast_GP=3.
                        5: Performs global instruction scheduling optimization
                           suited to scientific applications in addition to
                           -Kfast_GP=3.
-Klargepage             Generates an executable program which utilizes the
                        Parallelnavi largepage facility.
-Knoalias               Specifies that pointer variables do not share memory
                        areas with other variables.
-Knoiopt                Prevents interprocedural optimization.
-KNOFLTLD[=N]           Generates non-faulting load instructions. When a
                        non-faulting load instruction is used, no signal is
                        raised even if the given address is invalid. With this
                        option, a memory access can be performed before the
                        memory address has been validated.
                        N is the maximum number of non-faulting load
                        instructions for each loop or branch instruction.
                        The compiler determines the default value of N when
                        it is omitted. N can be a number from 1 to 255.
-Konefile               Generates only one temporary file (.s file or .o file)
                        when multiple source files are compiled in one
                        compilation.
-Kpopt                  Optimizes data pointed to by pointers, using the
                        limited interpretation that the areas referred to by
                        pointers are referred to only by pointers.
-Kpg                    Generates instructions to produce a profile file for
                        subsequent optimization (global instruction scheduling
                        etc.).
-Kpu[=file]             Performs optimization (global instruction scheduling,
                        etc.) using the program runtime profile information
                        obtained by specifying the -Kpg option.
-Kpreload               Moves load instructions across branches.
-Kmemalias              For indirect memory accesses through pointers, no
                        memory alias is assumed when the accessing types are
                        different.
-Kprefetch[={1|2|3|4}]  Generates prefetch instructions according to the
                        prefetch level.
                        1: Basic level prefetch for array elements in the
                           inner-most loop only.
                        2: In addition to -Kprefetch=1, generates prefetch
                           instructions within the loop pre-header for the
                           array elements accessed in the first iteration of
                           the loop.
                        3: In addition to -Kprefetch=2, when the access stride
                           for array elements is larger than the cache line
                           size, the compiler generates a prefetch instruction
                           for each cache line size access.
                        4: Maximum level for generating prefetch instructions.
                           In addition to -Kprefetch=3, the compiler generates
                           prefetch instructions for array elements accessed
                           in the outer loop.
-Kprefetch_iteration=N  Generates prefetch instructions for the data which is
                        referred to N iterations later.
-Knoprefetch            Suppresses use of prefetch instructions.
-Krestp                 This option specifies arguments optimization based on
                        ANSI (C Standard).
-Kstaticclump           Gathers all static and global variables, except large
                        arrays, so that they are accessed with the same base
                        address and index.
-Kswitchopt             Changes the organization of case statements according
                        to the execution ratio of each case.
-Kunroll[=N]            Performs loop unrolling. N is the upper limit of the
                        unrolling expansion number; its value should be from
                        2 to 9999.
-Knounroll              Prevents loop unrolling optimizations.
-Kuse_rodata            Specifies whether string constants, floating point
                        constants and initialization values of aggregate type
                        local storage variables are allocated to the read-only
                        data section.
-Kxi=N                  Inline expansion is performed instead of function
                        calls. The functions to expand are selected from the
                        profiler results. N is the permitted increase in
                        object size, as a percentage.
-O[level]               Specifies the optimization level.
                        0: No optimization.
                        1: Basic optimization.
                        2: Loop unrolling in addition to -O1.
                        3: Global instruction scheduling and restructuring of
                           nested loops in addition to -O2.
                        4: Enhanced loop restructuring optimization compared
                           with -O3.
-x-                     Inline expansion, instead of function calls, is
                        performed for all functions defined in the C source
                        code.
-x stm_no               Applies inline expansion to user-defined external
                        procedures that have fewer executable statements than
                        the number given by the stm_no argument.
-x dir=directory_name   Performs inline expansion of procedures defined in the
                        files under the directory specified and in the file
                        currently being compiled.
-Xa                     The C language specification used at compile time is
                        based on the C language standard (ANSI standard).
                        This is the default mode.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Sun Forte Developer 6 update 2 flag description

Compiler Options        Remark
--------------------------------------------------------------------------------
-D                      Set definition for preprocessor.
-dalign                 Assume double-type data is double aligned.
-dn                     Specify static binding.
-e                      Accept extended (132 character) input source lines
                        (FORTRAN).
-fast                   This is a convenience option for selecting a set of
                        optimizations for performance, and it chooses:
                        o The -native best machine characteristics option
                          (-xarch=native, -xchip=native, -xcache=native)
                        o Optimization level: -xO5
                        o A set of inline expansion templates (-libmil)
                        o The -fsimple=2 option
                        o The -dalign option
                        o The -xalias_level=basic option (C only)
                        o The -xlibmopt option
                        o The -xdepend option (FORTRAN only)
                        o The -xprefetch option (FORTRAN only)
                        o Options to turn off all trapping
                          (-fns -ftrap=%none)
-fixed                  Accept fixed-format input source files (FORTRAN).
-fns                    Select non-standard floating point mode. This flag
                        causes the nonstandard floating point mode to be
                        enabled when a program begins execution. By default,
                        the nonstandard floating point mode will not be
                        enabled automatically.
                        On some SPARC systems, the nonstandard floating point
                        mode disables "gradual underflow", causing tiny
                        results to be flushed to zero rather than producing
                        subnormal numbers. It also causes subnormal operands
                        to be silently replaced by zero. On those SPARC
                        systems that do not support gradual underflow and
                        subnormal numbers in hardware, use of this option can
                        significantly improve the performance of some
                        programs.
                        Warning: When nonstandard mode is enabled, floating
                        point arithmetic may produce results that do not
                        conform to the requirements of the IEEE 754 standard.
                        See the Numerical Computation Guide for more
                        information.
-fsimple=0              Permits no simplifying assumptions. Preserves strict
                        IEEE 754 conformance.
-fsimple=1              With -fsimple=1, the optimizer can assume the
                        following:
                        o The IEEE 754 default rounding/trapping modes do not
                          change after process initialization.
                        o Computations producing no visible result other than
                          potential floating-point exceptions may be deleted.
                        o Computations with Infinity or NaNs as operands need
                          not propagate NaNs to their results. For example,
                          x*0 may be replaced by 0.
                        o Computations do not depend on sign of zero.
-fsimple=2              Permits aggressive floating point optimizations that
                        may cause programs to produce different numeric
                        results due to changes in rounding. Even with
                        -fsimple=2, the optimizer still is not permitted to
                        introduce a floating point exception in a program
                        that otherwise produces none.
-fsimple[=n]            Allows the compiler to make simplifying assumptions
                        concerning floating-point arithmetic.
-ftrap=t                Sets the IEEE 754 trapping mode in effect at startup.
                        t is a comma-separated list that consists of one or
                        more of the following: %all, %none, common,
                        [no%]invalid, [no%]overflow, [no%]underflow,
                        [no%]division, [no%]inexact.
                        The default is -ftrap=%none.
                        This option sets the IEEE 754 trapping modes that are
                        established at program initialization. Processing is
                        left-to-right. The common exceptions, by definition,
                        are invalid, division by zero, and overflow.
                        o %none, the default, turns off all trapping modes.
                        Do not use this option for programs that depend on
                        IEEE standard exception handling; you can get
                        different numerical results, premature program
                        termination, or unexpected SIGFPE signals.
-inline=%auto           Enables automatic inlining.
-libmil                 Use inline expansion templates for libm.
-library=iostream       Use the "classic" (pre 1998 C++ standard) iostream
                        library.
                        Prior to the C++ standard (1998), there was one
                        iostream library, which is now often called "classic"
                        iostreams. The C++ standard defines a different, but
                        similar, iostream library, which we call "standard"
                        iostreams. To get classic iostreams in standard
                        (default) mode, you use the option
                        "-library=iostream".
-library=iostream,no%Cstd
                        Same as -library=iostream plus -library=no%Cstd.
                        You can combine two options into one:
                        -library=iostream,no%Cstd
-library=no%Cstd        Do not find the standard library headers or the
                        runtime library itself, in order to avoid mixing the
                        standard library with classic iostreams.
                        Some of the remainder of the C++ standard library (in
                        particular, the standard string class) relies on
                        using standard iostreams. If you attempt to use those
                        library features with classic iostreams, the code
                        will not work. To ensure that you don't attempt to
                        mix the standard library with classic iostreams, you
                        can use the option -library=no%Cstd. With that
                        option, the compiler does not find the standard
                        library headers or the runtime library itself.
-lm                     Link with math library.
-lmopt                  This chooses the math library that is optimized for
                        speed.
-native                 Select native machine characteristics for
                        optimization.
-Qoption                Pass flags along to compiler phase:
                        f90comp  Fortran first pass
                        iropt    Global optimizer
                        cg       Code generator
-Qoption cg             See -Wc, below.
-Qoption cg -Qgsched-T4
                        See -Wc,-Qgsched-T4
-Qoption cg -Qgsched-trace_late=1
                        See -Wc,-Qgsched-trace_late=1
-Qoption iropt -Mt      See -W2,-Mt
-Qoption iropt -restrict
                        See -W2,-O4+restrict
-Qoption iropt -restrict_g
                        See -W2,-O4+restrict_g
-Qoption f90comp -array_pad_rows,
                        Enable padding of f90 arrays by n.
-Qoption f90comp -expansion
                        Enable f90 array expansion.
-Qoption iropt +ansi_alias
                        Assume (more restrictive) ANSI C semantics for
                        pointer aliasing.
-Qoption iropt          See -W2, below.
-Qoption iropt -Adata_access
                        Enable optimizations based on data access patterns.
-Qoption iropt -Amemopt
                        See -W2,-Amemopt
-Qoption iropt bmerge   Enable branch merge optimizations.
-Qoption iropt -Ma      See -W2,-Ma
-Qoption iropt -Mm      See -W2,-Mm
-Qoption iropt -Mr      See -W2,-Mr
-Qoption iropt -O4+algassoc
                        Enable floating point reassociation.
-Qoption iropt -O4+bcopy
                        Allows replacing copy and memset loops with library
                        calls.
-Qoption iropt -O4+scalarrep
                        Disable scalar replacement optimization.
-Qoption iropt -whole   See -W2,-whole
-stackvar               Allocate routine local variables on stack (FORTRAN).
-W,                     Pass flags along to compiler phase:
                        2  global optimizer
                        c  code generator
-W2,-Abopt              Enable aggressive optimizations of all branches.
-W2,-Adata_access       Enable optimizations based on data access patterns.
-W2,-Aheap              Allows the compiler to recognize malloc-like memory
                        allocation functions.
-W2,-Amemopt            Memory access optimization. This does whole-program
                        mode inter-procedural memory access analysis, merges
                        memory allocations, and performs cache conscious data
                        layout program transformations.
-W2,-Aunroll            Enables outer-loop unrolling.
-W2,-crit               Enable optimization of critical control paths.
-W2,-Ma                 Enable inlining of routines with frame size up to n.
-W2,-Mm                 Maximum module increase limit for inlining.
-W2,-Mp                 Procedures with entry counts equal to or greater than
                        n become candidates for inlining.
-W2,-Mr                 Maximum code increase due to inlining is limited to
                        n triples.
-W2,-Ms                 Maximum level of recursive inlining.
-W2,-Mt                 The maximum size of a routine body eligible for
                        inlining is limited to n triples.
-W2,-O4+ansi_alias      Assume (more restrictive) ANSI C semantics for
                        pointer aliasing.
-W2,-O4+restrict        This tells the compiler to assume that different
                        pointer-type formal parameters point to their own
                        memory locations (C restricted pointers).
-W2,-O4+restrict_g      This tells the compiler to assume that different
                        global pointer variables point to their own memory
                        locations.
-W2,-reroll=1           Turns on loop rerolling.
-W2,-whole              Do whole program optimizations.
-Wc,-Qdepgraph-early_cross_call=1
                        Enable early cross-call instruction scheduling.
-Wc,-Qgsched-T4         Sets the aggressiveness of the trace formation.
-Wc,-Qgsched-trace_late=1
                        Turns on the late trace scheduler.
-Wc,-Qgsched-trace_spec_load=1
                        Turns on the conversion of loads to non-faulting
                        loads inside the trace.
-Wc,-Qiselect-funcalign=n
                        Do function entry alignment at n-byte boundaries.
-Wc,-Qpeep-Sh0          Disables the max live base registers algorithm for
                        sethi hoisting.
-Xa                     Assume ANSI C conformance, allow K & R extensions.
                        (default mode)
-xalias_level=          Allows compiler to perform type-based alias analysis
                        at the given alias level.
                        basic   assume ISO C9X aliasing rules for basic types
                                only.
                        std     assume ISO C9X aliasing rules.
                        strong  assume all pointers are type safe (strongly
                                typed).
-xarch=                 Limit the set of instructions the compiler may use.
-Xc                     Assume strict ANSI C conformance.
-xcache=c               Defines the cache properties for use by the
                        optimizer. c must be one of the following:
                        o native (set parameters for the host environment)
                        o s1/l1/a1
                        o s1/l1/a1:s2/l2/a2
                        o s1/l1/a1:s2/l2/a2:s3/l3/a3
                        The si/li/ai are defined as follows:
                        si  The size of the data cache at level i, in
                            kilobytes.
                        li  The line size of the data cache at level i, in
                            bytes.
                        ai  The associativity of the data cache at level i.
-xchip=c                Specifies the target processor for use by the
                        optimizer. c must be one of: generic, generic64,
                        native, native64, old, super, super2, micro, micro2,
                        hyper, hyper2, powerup, ultra, ultra2, ultra2i,
                        ultra3, 386, 486, pentium, pentium_pro, 603, 604.
-xcrossfile             Enable cross-file inlining.
-xdepend                Analyze loops for data dependencies.
-xO1                    Does basic local optimization (peephole).
-xO2                    -xO1 plus more local and global optimizations.
-xO3                    Besides what -xO2 does, it optimizes references or
                        definitions for external variables. Loop unrolling
                        and software pipelining are also performed.
-xO4                    -xO3 plus function inlining.
-xO5                    Besides what -xO4 does, it enables speculative code
                        motion.
-xpad=common[:n]        Pad common block variables, for better use of cache.
                        n specifies the amount of padding to apply. If no
                        parameter is specified then the compiler selects one
                        automatically.
-xpad=local[:n]         Pad local variables only, for better use of cache.
                        n specifies the amount of padding to apply. If no
                        parameter is specified then the compiler selects one
                        automatically.
-xparallel              Use parallel processing to improve performance.
-xprefetch              Enable generation of prefetch instructions.
-xprofile               See below.
-xprofile=collect       Collect profile data for feedback directed
                        optimizations.
-xprofile=use           Use data collected for profile feedback.
-xreduction             Parallelize loops containing reductions.
-xregs=syst             Allows use of the system reserved registers %g6 and
                        %g7, and %g5 if not already allowed by the -xarch
                        value.
-xrestrict[=f1,...,f2,%all,%none]
                        Treat pointer-valued function parameters as
                        restricted pointers. The default is %none.
                        Specifying -xrestrict is equivalent to specifying
                        -xrestrict=%all.
-xsafe=mem              Enables the use of non-faulting loads when used in
                        conjunction with -xarch=v8plus. Assumes that no
                        memory based traps will occur.
-Xt                     Assume K & R conformance, allow ANSI C.
-xvector                Enable vectorization of loops with calls to math
                        routines.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/etc/system (system configuration information file) description

System Tunables         Remark
--------------------------------------------------------------------------------
consistent_coloring     Controls the page coloring policy. It can be set to
                        one of the following:
                        0: (default) dynamic (uses various vaddr bits)
                        1: static (virtual=paddr)
tune_t_fsflushr         The number of seconds between fsflush invocations for
                        checking dirty memory.
autoup                  The frequency of file system sync operations.
shmsys:shminfo_shmmax   Maximum size of a System V shared memory segment that
                        can be created.
shmsys:shminfo_shmmni   System wide limit on the number of shared memory
                        segments that can be created.
shmsys:shminfo_shmseg   Limit on the number of shared memory segments that
                        any one process can create.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/etc/opt/FJSVpnrm/lpg.conf (Large page management information file) description

Tunables                Remark
--------------------------------------------------------------------------------
TSS=size[unit]          Size of total memory to be used for large page
                        segments.
                        At start of the system, this amount of memory is
                        reserved and initialized.
                        "unit" can be M for mega-byte and G for giga-byte.
SHMSEGSIZE=size[unit]   Size of large page segment.
                        "unit" can be M for mega-byte and G for giga-byte.
--------------------------------------------------------------------------------
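
The size[unit] notation used by TSS and SHMSEGSIZE above can be illustrated
with a short Python sketch. It is not part of the Parallelnavi tooling: the
function name parse_lpg_entry and the sample values 4G and 256M are
hypothetical, and the sketch assumes that M and G denote binary multiples
(2**20 and 2**30 bytes), which the description above does not state. Because
the description does not say which unit applies when none is given, the sketch
reports no byte count in that case rather than guessing.

```python
import re

# Assumption: M and G are binary multiples. The flag description only says
# "mega-byte" and "giga-byte".
UNIT_BYTES = {"M": 2**20, "G": 2**30}

# One entry has the form KEY=size[unit], e.g. TSS=4G or SHMSEGSIZE=256M.
_ENTRY = re.compile(r"^(TSS|SHMSEGSIZE)=(\d+)([MG]?)$")


def parse_lpg_entry(line):
    """Return (name, size, unit, size_in_bytes) for one lpg.conf entry.

    size_in_bytes is None when no unit is given, because the default unit
    is not documented in the description above.
    """
    match = _ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised lpg.conf entry: {line!r}")
    name, digits, unit = match.groups()
    size = int(digits)
    size_in_bytes = size * UNIT_BYTES[unit] if unit else None
    return name, size, unit, size_in_bytes


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    for entry in ("TSS=4G", "SHMSEGSIZE=256M"):
        print(parse_lpg_entry(entry))
```

A check of this kind could be run over a candidate lpg.conf before it is
installed, for example to confirm that each entry is well formed and carries
an explicit unit.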