SPEC Accel OpenMP Flag Description for the Intel(R) C/C++ Compiler for IA32 and Intel 64 applications and Intel(R) Fortran Compiler for IA32 and Intel 64 applications

Optimization Flags

- -Istd
- -I.?\s*[^ ]*include[^ ]*
- Adds the directory for include files to the search path at compile time.
- -Istdi
- -I.?\s
- Adds the directory for include files to the search path at compile time.
- -Lstd
- -L\s*[^ ]*[^ ]*
- Adds the library directory to the search path at link time.
- -qopenmp
- -qopenmp(?=\s|$)
- Enable the compiler to generate multi-threaded code based on the OpenMP* directives (same as -fopenmp)
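
As a rough illustration of the kind of directive that -qopenmp acts on, the sketch below sums squares with an OpenMP reduction. The function name sum_of_squares is hypothetical and is not taken from any benchmark source. When the flag is absent, the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Sum of i*i for i in [0, n).
   With -qopenmp (or -fopenmp) the iterations are divided among threads and
   the per-thread partial sums are combined by the reduction clause. */
long long sum_of_squares(int n)
{
    assert(n >= 0);
    long long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += (long long)i * i;
    }
    return total;
}
```

Because the sum is over integers, the result does not depend on how many threads take part, which makes the function easy to check with and without the flag.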
- -qopenmp-offload
- -qopenmp-offload=(host|mic|gfx)(?=\s|$)
- Enables OpenMP* offloading compilation for target pragmas. This option only applies to Intel(R) MIC Architecture and Intel(R) Graphics Technology. Enabled by default with -qopenmp. Use -qno-openmp-offload to disable.
  The optional kind argument sets the default device for target pragmas:
  host - allow target code to run on the host system while still doing the outlining for offload
  mic - specify Intel(R) MIC Architecture
  gfx - specify Intel(R) Graphics Technology
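
A minimal sketch of a target pragma of the sort this option compiles for offload is shown below. The function saxpy_target and its parameters are hypothetical names chosen for illustration. Which device actually executes the region depends on the offload kind and the system; if OpenMP is not enabled at all, the pragma is ignored and the loop runs on the host.

```c
#include <assert.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
   The map clauses describe the data movement for the offloaded region:
   x is copied to the device, y is copied to the device and back. */
void saxpy_target(int n, float a, const float *x, float *y)
{
    assert(n >= 0);
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```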
- -Ofast
- -Ofast(?=\s|$)
- Enable -O3 -no-prec-div -fp-model fast=2 optimizations.
- -O3
- -O3(?=\s|$)
- Optimize for maximum speed and enable more aggressive optimizations that may not improve performance on some programs.
- -xCORE-AVX2
- -xCORE-AVX2(?=\s|$)
- Code is optimized for Intel(R) processors with support for AVX2 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  May generate Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2), Intel(R) AVX, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, and SSE instructions for Intel(R) processors.
  
  Do not use this option if you are executing a program on a processor that is not an Intel processor. If you use this option on a non-compatible processor to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error if it is executed on an unsupported processor.
- -xCORE-AVX512
- -xCORE-AVX512(?=\s|$)
- Code is optimized for Intel(R) processors with support for CORE-AVX512 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  Do not use this option if you are executing a program on a processor that is not an Intel processor. If you use this option on a non-compatible processor to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error if it is executed on an unsupported processor.
- -qopt-zmm-usage
- -qopt-zmm-usage=(low|high)(?=\s|$)
- -qopt-zmm-usage=
  Specifies the level of zmm registers usage. You can specify one of the following:
  low - Tells the compiler that the compiled program is unlikely to benefit from zmm registers usage. It specifies that the compiler should avoid using zmm registers unless it can prove the gain from their usage.
  high - Tells the compiler to generate zmm code without restrictions.
- -xCOMMON-AVX512
- -xCOMMON-AVX512(?=\s|$)
- May generate Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) Foundation instructions, Intel(R) AVX-512 Conflict Detection instructions, as well as the instructions enabled with CORE-AVX2. Optimizes for Intel(R) processors that support Intel(R) AVX-512 instructions.
- -marchcore-avx2
- -march=core-avx2(?=\s|$)
- May generate Intel(R) AVX2, AVX, Intel(R) SSE4.2, SSE4.1, SSSE3, SSE3, SSE2 and SSE instructions. /arch:core-avx2 is supported on Windows*, but -mcore-avx2 is not supported on Linux* or macOS* (use -march=core-avx2 instead).
- -fimf-precision
- -fimf-precision=(high|medium|low):([a-z\,/]+)(?=\s|$)
- -fimf-precision=value[:funclist]
  defines the accuracy (precision) for math library functions
  value - defined as one of the following values
  high - equivalent to max-error = 0.6
  medium - equivalent to max-error = 4 (DEFAULT)
  low - equivalent to accuracy-bits = 11 (single precision); accuracy-bits = 26 (double precision)
  funclist - optional comma separated list of one or more math library functions to which the attribute should be applied
- -qopt-streaming-stores
- -qopt-streaming-stores.always(?=\s|$)
- Specifies whether streaming stores are generated:
  
  always - enables generation of streaming stores under the assumption that the application is memory bound
  
  auto - compiler decides when streaming stores are used (DEFAULT)
  
  never - disables generation of streaming stores
- -qopt-prefetch
- -qopt-prefetch=([0-5])(?=\s|$)
- Enables prefetch insertion at level n, where n may be 0 through 5 inclusive and 0 disables prefetching. The default is 2.
- -no-prec-sqrt
- -no-prec-sqrt(?=\s|$)
- -prec-sqrt improves precision of floating-point square root. It has a slight impact on speed. -no-prec-sqrt disables this option and enables optimizations that give slightly less precise results than full IEEE square root.
- -qopt-multiple-gather-scatter-by-shuffles
- -qopt-multiple-gather-scatter-by-shuffles(?=\s|$)
- Enables or disables the optimization for multiple adjacent gather/scatter type vector memory references.
- -no-prec-div
- -no-prec-div(?=\s|$)
- -prec-div improves precision of floating-point divides. It has a slight impact on speed. -no-prec-div disables this option and enables optimizations that give slightly less precise results than full IEEE division.
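
To make "slightly less precise" concrete, the sketch below compares a correctly rounded quotient with one formed by multiplying by a reciprocal. This is only an illustration of the class of transformation involved, not a description of the code the compiler actually emits; the function names are hypothetical.

```c
#include <assert.h>

/* Correctly rounded IEEE quotient: a single rounding step. */
double divide_exact(double a, double b)
{
    assert(b != 0.0);
    return a / b;
}

/* Quotient formed as a * (1/b). The volatile store forces the reciprocal
   to be rounded to double before the multiply, so two roundings occur
   and the result can differ from a / b in the last bit. */
double divide_by_reciprocal(double a, double b)
{
    assert(b != 0.0);
    volatile double reciprocal = 1.0 / b;
    return a * reciprocal;
}
```

In IEEE 754 double precision, divide_exact(49.0, 49.0) is exactly 1.0, while divide_by_reciprocal(49.0, 49.0) comes out slightly below 1.0. For divisors whose reciprocal is exactly representable, such as 2.0, the two functions agree.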
- -ansi-alias
- -ansi-alias(?=\s|$)
- Enable or disable (DEFAULT) the use of ANSI aliasing rules in optimizations. When enabled, the user asserts that the program adheres to these rules.
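
As a hedged example of code that adheres to the aliasing rules, the sketch below reads the bit pattern of a float without accessing it through a pointer of an unrelated type. The function name float_bits is hypothetical, and the expected bit patterns assume IEEE 754 single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of f.
   Reading f through a cast such as *(uint32_t *)&f would access a float
   object through an incompatible type, which the aliasing rules forbid.
   Copying the bytes with memcpy is well defined. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Code written in the cast style may appear to work when aliasing-based optimizations are off and then change behavior once the compiler is allowed to assume the rules hold.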
- -ipo
- -ipo(?=\s|$)
- -ipo[n]
  
  Multi-file interprocedural (IP) optimizations that include:
  
  - inline function expansion
  
  - interprocedural constant propagation
  
  - dead code elimination
  
  - propagation of function characteristics
  
  - passing arguments in registers
  
  - loop-invariant code motion
  (n - number of multi-file objects)
- -fp-model
- -fp-model\s(except|no\-except|fast\=(1|2)|precise|source|strict|double|extended)(?=\s|$)
- enable floating point model variation
  
  [no-]except - enable/disable floating point exception semantics
  
  fast[=1|2] - enables more aggressive floating point optimizations
  
  precise - allows value-safe optimizations
  
  source - enables intermediates in source precision
  
  strict - enables -fp-model precise -fp-model except, disables contractions, and enables pragma stdc fenv_access
  
  double - rounds intermediates in 53-bit (double) precision
  
  extended - rounds intermediates in 64-bit (extended) precision
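
The reason these models matter is that floating-point addition is not associative, so an optimization that regroups operations can change the result. The sketch below shows this with two hypothetical helper functions; it illustrates the kind of difference a value-unsafe reordering could introduce and makes no claim about which transformations a particular model performs.

```c
#include <assert.h>

/* Computes (a + b) + c. The volatile store forces the partial sum to be
   rounded to double before the second addition. */
double sum_left_first(double a, double b, double c)
{
    volatile double partial = a + b;
    return partial + c;
}

/* Computes a + (b + c), the same sum with a different grouping. */
double sum_right_first(double a, double b, double c)
{
    volatile double partial = b + c;
    return a + partial;
}

/* At magnitude 1e17 adjacent doubles are 16 apart, so -1e17 + 1.0 rounds
   back to -1e17 and the 1.0 is lost in the second grouping. */
void demo_reassociation(void)
{
    assert(sum_left_first(1e17, -1e17, 1.0) == 1.0);
    assert(sum_right_first(1e17, -1e17, 1.0) == 0.0);
}
```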

- -port_80
- -80
- FPORTABILITY flag: treats fixed-form Fortran source lines as 80 columns wide (equivalent to -extend-source 80).
- -port_noformain
- -nofor-main
- Specifies that the main program is not written in Fortran; the C main() function is used instead.
- -declare_use_inner_simd
- -DSPEC_USE_INNER_SIMD
- Enables the use of nested SIMD statements for OpenMP.
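
How the benchmark sources use this macro is not shown in this document. The sketch below assumes a common guard pattern in which an inner simd directive is compiled in only when SPEC_USE_INNER_SIMD is defined; the function scale_rows and its parameters are hypothetical.

```c
#include <assert.h>

/* Multiplies every element of a rows x cols row-major matrix by factor.
   The outer loop is shared among threads; the inner simd directive is
   present only when the code is built with -DSPEC_USE_INNER_SIMD. */
void scale_rows(int rows, int cols, double *m, double factor)
{
    assert(rows >= 0 && cols >= 0);
    #pragma omp parallel for
    for (int r = 0; r < rows; ++r) {
#ifdef SPEC_USE_INNER_SIMD
        #pragma omp simd
#endif
        for (int c = 0; c < cols; ++c) {
            m[r * cols + c] *= factor;
        }
    }
}
```

The numerical result is the same with or without the macro; only the vectorization hint given to the compiler changes.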

Compiler Flags

-intel_cc
-intel_CC
-intel_f90
-intel_icx
-intel_icpx
-intel_ifx

- -intel_cc
- (?:/\S+/)?icc\b
- Invoke the Intel C compiler.
- -intel_CC
- (?:/\S+/)?icpc(?=\s|$)
- Invoke the Intel C++ compiler.
- -intel_f90
- (?:/\S+/)?ifort\b
- Invoke the Intel Fortran compiler.
- -intel_icx
- (?:/\S+/)?icx\b
- Invoke the Intel C compiler.
- -intel_icpx
- (?:/\S+/)?icpx(?=\s|$)
- Invoke the Intel C++ compiler.
- -intel_ifx
- (?:/\S+/)?ifx\b
- Invoke the Intel Fortran compiler.

Other Flags

-lfftw3

- -lfftw3
- -lfftw3(?=\s|$)
- Link using the FFTW 3.3.6 library for Linux. The FFTW library was compiled with -O3 -xCORE-AVX2.
  
  Description from FFTW:
  
  FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST).

Shell, Environment, and Other Software Settings

One or more of the following settings may have been applied to the testbed. If so, the "Platform Notes" section of the report will say so; and you can read below to find out more about what these settings mean.

LD_LIBRARY_PATH=<directories> (linker)
LD_LIBRARY_PATH controls the search order for both the compile-time and run-time linkers. Usually, it can be defaulted; but testers may sometimes choose to explicitly set it (as documented in the notes in the submission), in order to ensure that the correct versions of libraries are picked up.

STACKSIZE=<n>
Set the size of the stack (temporary storage area) for each slave thread of a multithreaded program.