------------------------------------------------------
Dell SPEC CPU2000 Flags Descriptions
Intel C/C++/Visual FORTRAN Compilers Version 8.0
------------------------------------------------------

------------------------------------------------------
General Options (C/C++/FORTRAN)
------------------------------------------------------

-O{1|2|3}
    Optimization-level options:
    1: optimizes for speed, but disables some optimizations that
       increase code size for a small speed benefit. Includes inline
       expansion of intrinsic functions, global optimizations, and
       string pooling optimizations.
    2: optimizes for speed (DEFAULT). The -O2 option includes the -O1
       optimizations and in addition enables inlining of intrinsics
       and more speed optimizations.
    3: builds on the -O1 and -O2 optimizations by enabling high-level
       optimization. This level does not guarantee higher performance
       unless loop and memory access transformations take place. In
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes
       the compiler to perform more aggressive data dependency
       analysis than for -O2. This may result in longer compilation
       times.

-Oa[-]
    Assume [do not assume] no aliasing in program.

-Qax{K|W|N|B|P}
    (IA-32 only) Generates specialized code for processor-specific
    codes (see -Qx) while also generating generic IA-32 code.

-Qx{K|W|N|B|P}
    (IA-32 only) Generates specialized code for processor-specific
    codes:
    K: Streaming SIMD Extensions (SSE)
    W: Intel processor with Streaming SIMD Extensions 2 (SSE2)
    N: Intel processor with Streaming SIMD Extensions 2 (SSE2)
    B: Intel processor with Streaming SIMD Extensions 2 (SSE2)
    P: Intel processor with Streaming SIMD Extensions 3 (SSE3)

    Notes:
    -Qx{N|B|P} and -Qax{N|B|P}: These options also enable advanced
    data layout and code restructuring optimizations to improve
    memory accesses for Intel processors.
    -Qx{N|B|P}: Programs in which the function main() is compiled
    with these options detect non-compatible processors and generate
    an error message during execution.
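
To make the -Oa description above more concrete, the following is a minimal C sketch of what "aliasing" means. The function name add_to_all is hypothetical and not part of any benchmark; the sketch only shows standard C behavior and does not claim to show how the Intel compiler uses the assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Adds the value *scale to every element of dst.
 * Because dst and scale have the same element type, C allows scale to
 * point into dst (the two pointers may "alias"). The compiler must then
 * reload *scale on every iteration, since a store to dst[i] could have
 * changed it. Under a no-aliasing assumption the load could be hoisted
 * out of the loop, which is only safe if the pointers never overlap. */
void add_to_all(int *dst, const int *scale, size_t n)
{
    assert(dst != NULL && scale != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] += *scale;
    }
}
```

With int a[3] = {1, 2, 3}, the aliased call add_to_all(a, &a[0], 3) yields {2, 4, 5} under standard C semantics, because a[0] changes after the first iteration. Code of this kind is where a no-aliasing assumption would not be safe.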

-Qip
    Enable single-file IP optimizations (within files, same as -Ob2).

-Qipo
    Enable multi-file IP optimizations (between files):
    - inline function expansion
    - interprocedural constant propagation
    - dead code elimination
    - propagation of function characteristics
    - passing arguments in registers
    - loop-invariant code motion

-Qprof_gen
    Instruments the program for profiling. This is the first phase of
    two-phase profile-guided optimization.

-Qprof_use
    Instructs the compiler to produce a profile-optimized executable
    and merges available dynamic information (.dyn) files into a
    pgopti.dpi file. If you perform multiple executions of the
    instrumented program, -Qprof_use merges the dynamic information
    files again and overwrites the previous pgopti.dpi file. Without
    any other options, the current directory is searched for .dyn
    files.

-Qrcd
    The Intel compiler uses the -Qrcd option to improve the
    performance of code that requires floating-point-to-integer
    conversions. The system default floating-point rounding mode is
    round-to-nearest, so values are rounded during floating-point
    calculations. However, the C language requires floating-point
    values to be truncated when they are converted to an integer. To
    do this, the compiler must change the rounding mode to truncation
    before each floating-point-to-integer conversion and change it
    back afterwards. The -Qrcd option disables this change of the
    rounding mode to truncation for all floating-point calculations,
    including floating-point-to-integer conversions. Turning on this
    option can improve performance, but floating-point conversions to
    integer may not conform to C semantics.

-Qunroll[n]
    Specifies the maximum number of times to unroll a loop. Omit n to
    let the compiler decide whether to perform unrolling. Use n = 0
    to disable the unroller.
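
The difference that the -Qrcd description above refers to can be sketched in portable C. This is an illustration only: convert_truncating and convert_nearest_even are hypothetical helper names, and the second function merely emulates round-to-nearest (ties to even) in software. It does not change the FPU rounding mode and is not what the compiler emits.

```c
#include <assert.h>
#include <limits.h>

/* C semantics: the fractional part is discarded (truncation). */
int convert_truncating(double x)
{
    assert(x > (double)INT_MIN && x < (double)INT_MAX);
    return (int)x;
}

/* Software emulation of a conversion performed while the rounding
 * mode is left at round-to-nearest, with ties going to the even
 * integer. Assumption: this models the kind of result a conversion
 * could produce when the switch to truncation is skipped. */
int convert_nearest_even(double x)
{
    assert(x > (double)INT_MIN && x < (double)INT_MAX);
    int t = (int)x;               /* truncated value */
    double frac = x - (double)t;  /* signed fractional part */
    if (frac > 0.5 || (frac == 0.5 && (t & 1))) {
        return t + 1;
    }
    if (frac < -0.5 || (frac == -0.5 && (t & 1))) {
        return t - 1;
    }
    return t;
}
```

For an input of 2.7, convert_truncating returns 2 while convert_nearest_even returns 3. A program whose results depend on the truncated value is the kind of code that may not conform to C semantics with -Qrcd.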

-Qwp_ipo
    In addition to -Qipo, makes the whole-program assumption that all
    variables and functions seen in the compiled sources are
    referenced only within those sources; the user must guarantee
    that this assumption is safe.

------------------------------------------------------
Flags Specific to C/C++
------------------------------------------------------

-G{5|6|7}
    Optimize code specifically for a targeted processor. Includes one
    or more of the following characters:
    5: Pentium processors with or without MMX technology
    6: Pentium Pro, Pentium II, and Pentium III processors
    7: Pentium 4 processor

-GR[-]
    Enables [disables (DEFAULT)] C++ Runtime Type Information (RTTI).

-GX[-]
    Enables [disables (DEFAULT)] C++ exception handling.

-Oi[-]
    Enables (DEFAULT) [disables] inline expansion of standard library
    functions.

-Zp{1|2|4|8|16}
    Use the -Zp{n} option to set the alignment constraint for
    structure declarations to an n-byte boundary (n = 1, 2, 4, 8,
    16). Generally, smaller constraints result in smaller data
    sections while larger constraints support faster execution. For
    example, to specify 2 bytes as the alignment constraint for all
    structures and unions in the file prog1.c, use the following
    command:

        -Zp2 prog1.c

    The default is -Zp16.

------------------------------------------------------
Flags Specific to FORTRAN
------------------------------------------------------

-Ob{0|1|2}
    Controls the compiler's inline expansion.
    0: disables inlining.
    1: disables inlining unless -Qip or -Ob2 is specified.
    2: enables inlining of any function. However, the compiler
       decides which functions are inlined. This option enables
       interprocedural optimizations and has the same effect as
       specifying the -Qip option.

-Qauto
    Causes all variables to be allocated on the stack, rather than in
    local static storage.

-Qprefetch[-]
    Enable [disable (DEFAULT)] prefetch insertion. The default with
    -O3 is -Qprefetch.
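
The size trade-off described for -Zp under the C/C++ flags above can be illustrated in C source. This sketch uses #pragma pack, a source-level directive accepted by several compilers, as a stand-in for the command-line option; treating the two as equivalent is an assumption made for illustration. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Default layout: the compiler pads after 'tag' so that 'value'
 * meets its natural alignment. */
struct rec_default { char tag; double value; };

/* 1-byte constraint: no padding at all. */
#pragma pack(push, 1)
struct rec_pack1 { char tag; double value; };
#pragma pack(pop)

/* 2-byte constraint: one padding byte after 'tag'. */
#pragma pack(push, 2)
struct rec_pack2 { char tag; double value; };
#pragma pack(pop)

/* Prints the size of each layout and the offset of 'value'. */
void report_layouts(void)
{
    /* A tighter constraint never makes the struct larger. */
    assert(sizeof(struct rec_pack1) <= sizeof(struct rec_pack2));
    printf("default: size %zu, value at %zu\n",
           sizeof(struct rec_default), offsetof(struct rec_default, value));
    printf("pack 1:  size %zu, value at %zu\n",
           sizeof(struct rec_pack1), offsetof(struct rec_pack1, value));
    printf("pack 2:  size %zu, value at %zu\n",
           sizeof(struct rec_pack2), offsetof(struct rec_pack2, value));
}
```

With an 8-byte double, the 1-byte and 2-byte layouts occupy 9 and 10 bytes respectively, while the default layout is typically larger. The smaller layouts save data space at the cost of a misaligned value field.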

-Qscalar_rep[-]
    Enables (DEFAULT) [disables] scalar replacement performed during
    loop transformations.

------------------------------------------------------
Miscellaneous
------------------------------------------------------

The starting tokens "/" and "-" are equivalent for flags passed to
the compiler. For example, -QxW and /QxW are identical switches.

+FDO
    PASS1= -Qprof_gen PASS2= -Qprof_use
    Using feedback-directed optimization, a profile is generated on
    the first pass of compilation and used on the second pass.

shlW32M.lib
    MicroQuill SmartHeap Library 7.1, available from
    www.microquill.com

------------------------------------------------------
Portability Options For CPU2000
------------------------------------------------------

176.gcc:
    -Dalloca=_alloca
        So as to use the built-in optimized alloca.
    /F10000000
        176.gcc uses alloca, and this option tells the linker to
        pre-allocate 10MB of stack. The default amount of stack
        allocated is not enough, and 176.gcc crashes with a run-time
        error.

178.galgel:
    /FI
        Fixed-format F90 source code.
    /F32000000
        Same as with 176.gcc; pre-allocates a 32MB stack.

186.crafty:
    -DNT_i386
        Specifies that it is a Windows NT Intel processor-based
        system, which makes the compiler use "_int64" as the 64-bit
        variable type that 186.crafty needs.

253.perlbmk:
    -DSPEC_CPU2000_NTOS
        Enables the code changes made for porting to Windows.
    -DPERLDLL
        On Windows, we need a perl.exe instead of a perl.exe and
        perl.dll. This pre-define ensures that the changes necessary
        to get a single, UNIX-style executable are made, without the
        indirect calls that can cause a 10% performance degradation.
        This allows the Windows-based executable to be as close as
        possible to the UNIX-based one.
    /MT
        Use the static multi-threaded library; otherwise it will not
        compile.

254.gap:
    -DSYS_HAS_CALLOC_PROTO -DSYS_HAS_MALLOC_PROTO
        These two pre-defines indicate that prototypes for calloc and
        malloc exist.
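
As an illustration of the -Dalloca=_alloca portability option listed for 176.gcc above, the sketch below performs the same name mapping in source instead of on the command line. It is not taken from the benchmark: sum_via_stack_copy is a hypothetical function, and the non-Windows branch (using <alloca.h>) is an assumption added so that the example builds on other systems.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_WIN32)
#include <malloc.h>          /* declares _alloca on Windows */
#ifndef alloca
#define alloca _alloca       /* same effect as -Dalloca=_alloca */
#endif
#else
#include <alloca.h>          /* assumption: platform provides alloca here */
#endif

/* Copies n ints into a temporary buffer taken from the current stack
 * frame and returns their sum. The buffer is released automatically
 * when the function returns. Large or deeply nested requests of this
 * kind consume stack space, which is why a larger stack reservation
 * can be needed. */
long sum_via_stack_copy(const int *src, size_t n)
{
    assert(src != NULL || n == 0);
    if (n == 0) {
        return 0;
    }
    int *tmp = (int *)alloca(n * sizeof *tmp);
    memcpy(tmp, src, n * sizeof *tmp);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += tmp[i];
    }
    return total;
}
```

Calling sum_via_stack_copy on the array {1, 2, 3, 4} returns 10. Source code written against the name alloca compiles unchanged once the macro maps it to the name the Windows toolchain provides.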