Compilers: Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2023.1.0 Build 20230320
Operating systems: Linux
Last updated: 20-Mar-2023
The text for many of the descriptions below was taken from the "icc/icx --help".
Copyright © 1985-2021 Intel Corporation. All Rights Reserved.
Adds the directory for include files to the search path at compile time.
Adds the library directory to the search path at link time.
Enable the compiler to generate multi-threaded code based on the OpenMP* directives (same as -fopenmp).
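As an illustration of the kind of source this option acts on, here is a minimal C sketch of a loop annotated with an OpenMP directive. The function name parallel_sum is hypothetical and is not taken from any benchmark. When OpenMP is not enabled, the pragma is ignored and the loop runs serially; because the sum is over integers, the result is the same either way.

```c
#include <stddef.h>

/* Hypothetical helper: sums n integers.
   With OpenMP enabled, the pragma splits the loop across threads and
   combines the per-thread partial sums into total. */
long long parallel_sum(const int *values, size_t n)
{
    long long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long long i = 0; i < (long long)n; ++i) {
        total += values[i];
    }
    return total;
}
```

A caller passes an array and its length, for example parallel_sum(data, 5).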
Enables OpenMP* offloading compilation for target pragmas. This
option only applies to Intel(R) MIC Architecture and Intel(R)
Graphics Technology. Enabled by default with -qopenmp.
Use -qno-openmp-offload to disable.
Sets the default device for target pragmas. kind is one of:
host - allow target code to run on the host system while still doing the outlining for offload
mic - specify Intel(R) MIC Architecture
gfx - specify Intel(R) Graphics Technology
Enable -O3 -no-prec-div -fp-model fast=2 optimizations.
Optimize for maximum speed and enable more aggressive optimizations that may not improve performance on some programs.
Improve precision of FP divides (some speed impact).
-fp-model
Selects the floating-point model variation to enable:
[no-]except - enable/disable floating-point exception semantics
fast[=1|2] - enables more aggressive floating-point optimizations
precise - allows value-safe optimizations
source - enables intermediates in source precision; sets -assume protect_parens for Fortran
strict - enables -fp-model precise -fp-model except, disables contractions and enables pragma stdc fenv_access
consistent - enables consistent, reproducible results for different optimization levels or between different processors of the same architecture
double - rounds intermediates in 53-bit (double) precision
extended - rounds intermediates in 64-bit (extended) precision
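The difference between value-safe and more aggressive floating-point models matters because floating-point addition is not associative. The following C sketch is an illustration only: the function names are hypothetical, and it assumes IEEE 754 double arithmetic. It evaluates the same three-term sum with two groupings. The volatile temporaries pin each grouping so the two orders can be compared regardless of the model in effect.

```c
/* Computes (a + b) + c with the grouping pinned by a volatile temporary. */
double sum_left_first(double a, double b, double c)
{
    volatile double partial = a + b;
    return partial + c;
}

/* Computes a + (b + c) with the grouping pinned by a volatile temporary. */
double sum_right_first(double a, double b, double c)
{
    volatile double partial = b + c;
    return a + partial;
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping returns 1.0, while the second returns 0.0 because 1.0 is lost when it is added to -1e20. An optimization that regroups the additions can therefore change the result.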
May generate Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) Foundation instructions, Intel(R) AVX-512 Conflict Detection instructions, Intel(R) AVX-512 Doubleword and Quadword instructions, Intel(R) AVX-512 Byte and Word instructions and Intel(R) AVX-512 Vector Length Extensions for Intel(R) processors, and the instructions enabled with CORE-AVX2.
-qopt-zmm-usage=
Specifies the level of zmm register usage. You can specify one of the following:
low - Tells the compiler that the compiled program is unlikely to benefit from zmm register usage. The compiler should avoid using zmm registers unless it can prove the gain from their usage.
high - Tells the compiler to generate zmm code without restrictions.
-fimf-precision=value[:funclist]
Defines the accuracy (precision) for math library functions.
value - defined as one of the following values:
  high - equivalent to max-error = 0.6
  medium - equivalent to max-error = 4 (DEFAULT)
  low - equivalent to accuracy-bits = 11 (single precision); accuracy-bits = 26 (double precision)
funclist - optional comma-separated list of one or more math library functions to which the attribute should be applied
Determine if certain square root optimizations are enabled.
Specifies whether streaming stores are generated:
always - enables generation of streaming stores under the assumption that the application is memory bound
auto - compiler decides when streaming stores are used (DEFAULT)
never - disables generation of streaming stores
Enable single-file IP optimization within files.
Enable multi-file IP optimization between files.
Enable levels of prefetch insertion, where 0 disables. n may be 0 through 5 inclusive. Default is 2.
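For readers unfamiliar with prefetching, the following C sketch shows what a software prefetch looks like when written by hand. It is not a description of what the compiler emits at any particular level. The function name and the PREFETCH_DISTANCE value are hypothetical, and __builtin_prefetch is a GCC/Clang-style builtin that is assumed to be available in the compiler used.

```c
#include <stddef.h>

/* Hypothetical look-ahead, in elements; a real value would be tuned. */
#define PREFETCH_DISTANCE 16

/* Sums n doubles while hinting that data a few iterations ahead
   will be read soon (second argument 0 = read, third 3 = keep in cache). */
double sum_with_prefetch(const double *data, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DISTANCE < n) {
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint: it does not change the value returned, only when the data is likely to arrive in cache.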
Enable/disable (DEFAULT) use of ANSI aliasing rules in optimizations; user asserts that the program adheres to these rules.
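To make "adheres to these rules" concrete, here is a small C sketch of one common way to inspect the bits of a float without violating the aliasing rules. The function name float_bits is hypothetical, and the sketch assumes float is a 32-bit IEEE 754 type. Reading the value through a cast such as *(uint32_t *)&value would break the rules; copying the bytes with memcpy does not.

```c
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of a float by copying its bytes,
   rather than reading it through a pointer of another type. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Under the stated assumption, float_bits(1.0f) yields 0x3F800000.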
Specifies a preferred vector width of 512 bits for auto-vectorization. Defaults to 'none', which allows target-specific decisions.
Enable the compiler to generate multi-threaded code based on the OpenMP* directives. Similar behavior was provided by -qopenmp in previous versions.
Enable LTO (Link Time Optimization) in 'full' mode.
Allow aggressive, lossy floating-point optimizations.
Turn on loop unroller.
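As a rough illustration of the transformation, the following C sketch unrolls a summation loop by hand. The factor of 4 and the function name are hypothetical; the compiler chooses its own factor and is not guaranteed to produce this exact shape.

```c
#include <stddef.h>

/* Sums n integers with the loop body repeated four times per iteration,
   followed by a remainder loop for the last n % 4 elements. */
long long sum_unrolled_by_4(const int *values, size_t n)
{
    long long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += values[i];
        total += values[i + 1];
        total += values[i + 2];
        total += values[i + 3];
    }
    for (; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

Unrolling reduces the number of loop-control branches executed per element at the cost of larger code.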
FPORTABILITY flag
No Fortran main method exists, use C equivalent instead.
Enables the use of nested SIMD statements for OpenMP.
Invoke the Intel C compiler.
Invoke the Intel C++ compiler.
Invoke the Intel Fortran compiler.
Invoke the Intel oneAPI C compiler.
Invoke the Intel oneAPI C++ compiler.
Invoke the Intel oneAPI Fortran compiler.
Link using FFTW 3.3.10 library for Linux. The FFTW library was compiled with -xCORE-AVX512 -qopt-zmm-usage=high -Ofast -fp-model fast=2.
Description from FFTW:
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST).
Platform settings
One or more of the following settings may have been applied to the testbed. If so, the "Platform Notes" section of the report will say so; and you can read below to find out more about what these settings mean.
LD_LIBRARY_PATH=<directories> (linker)
LD_LIBRARY_PATH controls the search order for both the compile-time and run-time linkers. Usually it can be left at its default, but testers may sometimes set it explicitly (as documented in the notes in the submission) to ensure that the correct versions of libraries are picked up.
STACKSIZE=<n> (Unix)
Set the size of the stack (temporary storage area) for each slave thread of a multithreaded program.
ulimit -s <n> (Unix)
Sets the stack size to n kbytes, or "unlimited" to allow the stack size to grow without limit.
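A program can check which stack limit is actually in effect. The following C sketch, assuming a POSIX system such as Linux, reads the soft stack limit with getrlimit. The function name query_stack_limit is hypothetical. Note that getrlimit reports bytes, whereas ulimit -s takes kbytes, and that a value of RLIM_INFINITY corresponds to "unlimited".

```c
#include <sys/resource.h>

/* Stores the soft stack limit (in bytes, or RLIM_INFINITY for unlimited)
   in *soft_limit_bytes. Returns 0 on success, -1 on failure. */
int query_stack_limit(rlim_t *soft_limit_bytes)
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_STACK, &limit) != 0) {
        return -1;
    }
    *soft_limit_bytes = limit.rlim_cur;
    return 0;
}
```

Calling this at program start is one way to confirm that a ulimit -s setting applied in the launching shell reached the process.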