Intel 8.0 Linux Compiler Option List
------------------------------------

Performance
-----------
-O1                         enable optimizations (DEFAULT); optimizes for speed.
                            For the Itanium compiler, -O1 turns off software
                            pipelining to reduce code size.
-O2                         same as -O1 on IA-32; same as -O on Itanium-based
                            systems
-O3                         enable -O2 plus more aggressive optimizations that
                            may increase compilation time and may not improve
                            performance for all programs
-O0                         disable optimizations
-O                          same as -O1
-fast                       enable -O3 -ipo -static
-Ob<n>                      control inline expansion:
                              n=0  disable inlining
                              n=1  inline functions declared with __inline, and
                                   perform C++ inlining
                              n=2  inline any function, at the compiler's
                                   discretion (same as -Qip)
-falias                     assume aliasing in program (DEFAULT)
-fno-alias                  assume no aliasing in program
-ffnalias                   assume aliasing within functions (DEFAULT)
-fno-fnalias                assume no aliasing within functions, but assume
                            aliasing across calls
-nolib_inline               disable inline expansion of intrinsic functions
-mp                         maintain floating-point precision (disables some
                            optimizations)
-mp1                        improve floating-point precision (speed impact is
                            less than -mp)
-fp                         disable using EBP as a general-purpose register
-prec_div                   improve precision of floating-point divides (some
                            speed impact)
-fp_port                    round fp results at assignments and casts (some
                            speed impact)
-fpstkchk                   enable fp stack checking after every
                            function/procedure call
-pc32                       set internal FPU precision to 24-bit significand
-pc64                       set internal FPU precision to 53-bit significand
-pc80                       set internal FPU precision to 64-bit significand
                            (DEFAULT)
-rcd                        rounding mode to enable fast float-to-int
                            conversions
-tpp5                       optimize for Pentium(R) processor
-tpp6                       optimize for Pentium(R) Pro, Pentium(R) II and
                            Pentium(R) III processors
-tpp7                       optimize for Pentium(R) 4 processor (DEFAULT)
-mcpu=<cpu>                 optimize for a specific cpu:
                              pentium     - optimize for Pentium(R) processor
                              pentiumpro  - optimize for Pentium(R) Pro,
                                            Pentium(R) II and Pentium(R) III
                                            processors
                              pentium4    - optimize for Pentium(R) 4 processor
                                            (DEFAULT)
-ax<codes>                  generate code specialized for processors specified
                            by <codes> while also generating generic IA-32
                            code. <codes> includes one or more of the
                            following characters:
                              K  Intel Pentium III and compatible Intel
                                 processors
                              W  Intel Pentium 4 and compatible Intel
                                 processors
                              N  Intel Pentium 4 and compatible Intel
                                 processors
                              P  Intel processors code-named Prescott
                              B  Intel Pentium M and compatible Intel
                                 processors
-x<codes>                   generate specialized code to run exclusively on
                            processors indicated by <codes> as described above
-march=<cpu>                generate code exclusively for a given <cpu>:
                              pentiumpro  - Pentium(R) Pro and Pentium(R) II
                                            processor instructions
                              pentiumii   - MMX(TM) instructions
                              pentiumiii  - streaming SIMD extensions
                              pentium4    - Pentium(R) 4 New Instructions

Advanced Performance
--------------------
Enable and specify the scope of Interprocedural (IP) Optimizations:
-ip                         enable single-file IP optimizations (within files)
-ipo                        enable multi-file IP optimizations (between files)
-ipo_c                      generate a multi-file object file (ipo_out.o)
-ipo_S                      generate a multi-file assembly file (ipo_out.s)

Modify the behavior of IP:
-ip_no_inlining             disable full and partial inlining (requires -ip or
                            -ipo)
-ip_no_pinlining            disable partial inlining (requires -ip or -ipo)
-ipo_obj                    force generation of real object files (requires
                            -ipo)

Other Advanced Performance Options:
-unroll[n]                  set maximum number of times to unroll loops. Omit
                            n to use default heuristics. Use n=0 to disable
                            the loop unroller.
-unroll [n]                 set maximum number of times to unroll loops. Omit
                            n to use default heuristics. Use n=0 to disable
                            the loop unroller.
-prof_dir <dir>             specify directory for profiling output files
                            (*.dyn and *.dpi)
-prof_file <file>           specify file name for profiling summary file
-prof_gen[x]                instrument program for profiling; with the x
                            qualifier, extra information is gathered
-prof_use                   enable use of profiling information during
                            optimization
-qp                         compile and link for function profiling with the
                            UNIX gprof tool
-p                          same as -qp
-prefetch[-]                enable (DEFAULT) / disable prefetch insertion
-vec_report[n]              control amount of vectorizer diagnostic
                            information:
                              n=0  no diagnostic information
                              n=1  indicate vectorized loops (DEFAULT)
                              n=2  indicate vectorized/non-vectorized loops
                              n=3  indicate vectorized/non-vectorized loops
                                   and prohibiting data dependence information
                              n=4  indicate non-vectorized loops
                              n=5  indicate non-vectorized loops and
                                   prohibiting data dependence information
-opt_report                 generate an optimization report to stderr
-opt_report_file<file>      specify the filename for the generated report
-opt_report_level[level]    specify the level of report verbosity
                            (min|med|max)
-opt_report_phase<name>     specify the phase that reports are generated
                            against
-opt_report_routine<name>   report on routines containing the given name
-opt_report_help            display the optimization phases available for
                            reporting
-tcheck                     generate instrumentation to detect multi-threading
                            bugs (requires Intel(R) Threading Tools; cannot be
                            used with the compiler alone)
-openmp                     enable the compiler to generate multi-threaded
                            code based on the OpenMP directives
-openmp_profile             link with the instrumented OpenMP runtime library
                            to generate OpenMP profiling information for use
                            with the OpenMP component of the VTune(TM)
                            Performance Analyzer
-openmp_stubs               enable the user to compile OpenMP programs in
                            sequential mode.
                            The OpenMP directives are ignored and a stub
                            OpenMP library is linked (sequential).
-openmp_report{0|1|2}       control the OpenMP parallelizer diagnostic level
-parallel                   enable the auto-parallelizer to generate
                            multi-threaded code for loops that can be safely
                            executed in parallel
-par_report{0|1|2|3}        control the auto-parallelizer diagnostic level
-par_threshold[n]           set threshold for the auto-parallelization of
                            loops, where n is an integer from 0 to 100
-alias_args[-]              enable (DEFAULT) / disable the C/C++ rule that
                            function arguments may be aliased; when disabling
                            the rule, the user asserts that this is safe
-ansi_alias[-]              enable / disable (DEFAULT) use of ANSI aliasing
                            rules in optimizations; when enabling, the user
                            asserts that the program adheres to these rules
-complex_limited_range[-]   enable / disable (DEFAULT) the use of the basic
                            algebraic expansions of some complex arithmetic
                            operations. This can improve performance in
                            programs that use a lot of complex arithmetic, at
                            the loss of some exponent range.

Portability flags for SPEC CPU2000:
-Dalloca=_alloca            Replace occurrences of alloca() with _alloca.
-DLINUX_i386                Linux Intel system; use "long long" as the 64-bit
                            variable type.
-DHAS_ERRLIST               Programming environment provides a specification
                            for "sys_errlist[]".
-DSPEC_CPU2000_LINUX_I386   Enable the code changes for porting to Linux on
                            the i386 architecture.
-DUSG                       Specify that the programming environment is like
                            System V Unix systems.
-DSPEC_CPU2000_NEED_BOOL    Use the SPEC-provided definition of the boolean
                            type.
-DPSEC_CPU2000_GLIBC22      Compatibility with 2.2 and later versions of glibc.
-DSYS_IS_USG                Specifies that the operating system is USG
                            compliant.
-DSYS_HAS_TIME_PROTO        Do not explicitly declare time().
-DSYS_HAS_SIGNAL_PROTO      Do not explicitly #include
-DSYS_HAS_IOCTL_PROTO       Do not explicitly declare ioctl().
-DSYS_HAS_ANSI              System is ANSI compliant.
-DSYS_HAS_CALLOC_PROTO      Do not explicitly declare calloc().
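
Note on the aliasing options: -fno-alias, -fno-fnalias, -alias_args- and -ansi_alias all rely on an assertion by the user that the program does not alias memory in the ways the option rules out. The following is a minimal C sketch of why that assertion matters. It is not taken from the Intel documentation; the function name add_twice and the macro ALIAS_DEMO_MAIN are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only: adds *src to *dst twice.
 * When dst and src point to the same object, the second addition reads
 * the value already updated by the first one. */
void add_twice(int *dst, const int *src)
{
    *dst += *src;
    *dst += *src;
}

/* ALIAS_DEMO_MAIN is a hypothetical macro: define it on the command line
 * (-DALIAS_DEMO_MAIN) to build this file as a standalone program. */
#ifdef ALIAS_DEMO_MAIN
int main(void)
{
    int x = 1;
    add_twice(&x, &x);          /* arguments alias: x becomes 2, then 4 */

    int a = 1, b = 1;
    add_twice(&a, &b);          /* arguments are distinct: a becomes 3 */

    printf("aliased: %d, distinct: %d\n", x, a);
    return 0;
}
#endif
```

With aliasing assumed (the default), the aliased call leaves x at 4 and the distinct call leaves a at 3. A compiler that has been told the arguments never alias is free to read *src once and reuse the value, which would change the result of the aliased call. For that reason the no-alias options are only appropriate when no call site passes overlapping pointers.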