Compilers: Intel C/C++/fortran Compiler
Operating systems: Linux
Selecting one of the following will take you directly to that section:
Adds the directory for include files to the search path at compile time.
Adds the directory for include files to the search path at compile time.
Adds the library directory search path at link time
Enable the compiler to generate multi-threaded code based on the OpenMP* directives (same as -fopenmp)
Enables OpenMP* offloading compilation for target pragmas. This
option only applies to Intel(R) MIC Architecture and Intel(R)
Graphics Technology. Enabled by default with -qopenmp.
Use -qno-openmp-offload to disable.
Specify kind to specify the default device for target pragmas
host - allow target code to run on host system while still doing
the outlining for offload
mic - specify Intel(R) MIC Architecture
gfx - specify Intel(R) Graphics Technology
Enable -O3 -no-prec-div -fp-model fast=2 optimizations.
optimize for maximum speed and enable more aggressive optimizations that may not improve performance on some programs
Code is optimized for Intel(R) processors with support for AVX2 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
May generate Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2), Intel(R) AVX, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, and SSE instructions for Intel(R) processors.
Do not use this option if you are executing a program on a processor that is not an Intel processor. If you use this option on a non-compatible processor to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error if they are executed on unsupported processors.
Code is optimized for Intel(R) processors with support for CORE-AVX512 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
Do not use this option if you are executing a program on a processor that is not an Intel processor. If you use this option on a non-compatible processor to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error if they are executed on unsupported processors.
-qopt-zmm-usage=
Specifies the level of zmm registers usage. You can specify one of the following:
low - Tells the compiler that the compiled program is unlikely to benefit from zmm registers usage. It specifies that the compiler should avoid using zmm registers unless it can prove the gain from their usage.
high - Tells the compiler to generate zmm code without restrictions
May generate Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) Foundation instructions, Intel(R) AVX-512 Conflict Detection instructions, as well as the instructions enabled with CORE-AVX2. Optimizes for Intel(R) processors that support Intel(R) AVX-512 instructions.
May generate Intel® AVX2, AVX, Intel® SSE4.2, SSE4.1, SSSE3, SSE3, SSE2 and SSE instructions /arch:core-avx2 is supported on Windows* but -mcore-avx2 is not supported for Linux* or macOS* (use -march=core-avx2 instead)
-fimf-precision=value[:funclist]
defines the accuracy (precision) for math library functions
value - defined as one of the following values
high - equivalent to max-error = 0.6
medium - equivalent to max-error = 4 (DEFAULT)
low - equivalent to accuracy-bits = 11 (single
precision); accuracy-bits = 26 (double
precision)
funclist - optional comma separated list of one or more math
library functions to which the attribute should be
applied
Specifies whether streaming stores are generated:
always - enables generation of streaming stores under the assumption that the application is memory bound
auto - compiler decides when streaming stores are used (DEFAULT)
never - disables generation of streaming stores
Enable levels of prefetch insertion, where 0 disables. n may be 0 through 5 inclusive. Default is 2.
-prec-sqrt improves precision of floating-point square root. It has a slight impact on speed. -no-prec-sqrt disables this option and enables optimizations that give slightly less precise results than full IEEE division.
Determine if certain square root optimizations are enabled.
-prec-div improves precision of floating-point divides. It has a slight impact on speed. -no-prec-div disables this option and enables optimizations that give slightly less precise results than full IEEE division.
Enable/disable(DEFAULT) use of ANSI aliasing rules in optimizations; user asserts that the program adheres to these rules.
-ipo[n]
Multi-file ip optimizations that includes:
- inline function expansion
- interprocedural constant propogation
- dead code elimination
- propagation of function characteristics
- passing arguments in registers
- loop-invariant code motion
(n - number of multi-file objects)
enable floating point model variation
[no-]except - enable/disable floating point semantics
fast[=1|2] - enables more aggressive floating point optimizations
precise - allows value-safe optimizations
source - enables intermediates in source precision
strict - enables -fp-model precise -fp-model except, disables
contractions and enables pragma stdc fenv_access
double - rounds intermediates in 53-bit (double) precision
extended - rounds intermediates in 64-bit (extended) precision
FPORTABILITY flag
No Fortran main method exists, use C equivalent instead.
Enables the use of nested SIMD statements for OpenMP.
Invoke the Intel C compiler.
Invoke the Intel C++ compiler.
Invoke the Intel Fortran compiler.
Invoke the Intel C compiler.
Invoke the Intel C++ compiler.
Invoke the Intel Fortran compiler.
Link using FFTW 3.3.6 library for Linux. Description from FFTW:
FFTW lib compiled with -O3 -xCORE-AVX2
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST).
Operating System Tuning settings
One or more of the following settings may have been applied to the testbed. If so, the "Platform Notes" section of the report will say so; and you can read below to find out more about what these settings mean.
LD_LIBRARY_PATH=<directories> (linker)
LD_LIBRARY_PATH controls the search order for both the compile-time and run-time linkers. Usually, it can be defaulted; but testers may sometimes choose to explicitly set it (as documented in the notes in the submission), in order to ensure that the correct versions of libraries are picked up.
STACKSIZE=<n>
Set the size of the stack (temporary storage area) for each slave thread of a multithreaded program.
ulimit -s <n>
Sets the stack size to n kbytes, or "unlimited" to allow the stack size to grow without limit.