Copyright © 2006 Intel Corporation. All Rights Reserved.
Invoke the Intel C++ compiler for IPF Linux64 to compile C applications
Invoke the Intel C++ compiler for IPF Linux64 to compile C++ applications
This macro specifies that the target system uses the LP64 data model; specifically, that integers are 32 bits, while longs and pointers are 64 bits.
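The LP64 sizes described above can be checked directly on a given compiler. The following is a minimal sketch, not part of the SPEC sources; the function names `is_lp64` and `print_data_model` are hypothetical.

```c
#include <stdio.h>

/* Returns 1 if the compiler's data model matches LP64
   (32-bit int, 64-bit long and pointers), 0 otherwise. */
int is_lp64(void)
{
    return sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8;
}

/* Prints the three sizes so the data model can be inspected by eye. */
void print_data_model(void)
{
    printf("int=%zu long=%zu pointer=%zu bytes\n",
           sizeof(int), sizeof(long), sizeof(void *));
}
```

Calling `print_data_model()` from a small test program reports the sizes in use; `is_lp64()` is expected to return 1 on a system that follows the LP64 model.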
This macro indicates that the benchmark is being compiled on an Intel IA64-compatible system running the Linux operating system.
This option indicates that the host system's integers are 32 bits wide and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
Portability changes for Linux
This flag can be set for SPEC compilation on Linux using the default compiler.
The -fast option enhances execution speed across the entire program by including the following options that can improve run-time performance:
-O3 (maximum speed and high-level optimizations)
-ipo (enables interprocedural optimizations across files)
-static (link libraries statically)
To override one of the options set by -fast, specify that option after the -fast option on the command line. The options set by -fast may change from release to release.
Enables use of faster but slightly less accurate code sequences for math functions, including sqrt, reciprocal sqrt, divide, and reciprocal. Compared to strict IEEE* precision, this option slightly reduces the accuracy of floating-point calculations performed by these functions, usually limited to the least significant digit. This option also performs reassociation transformations, which can alter the order of operations, over a larger scope. The increased reassociation lets the compiler generate better sequences of floating-point multiply-add instructions than it could without this option. Note that use of floating-point multiply-add can cause programs to produce different numerical results due to changes in rounding.
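Reassociation matters because floating-point addition is not associative. The sketch below is an illustration only and does not show what the compiler emits; the function names are hypothetical. It evaluates the same three-term sum in two orders.

```c
/* Sums three values left to right: (a + b) + c.
   The volatile temporary keeps the compiler from folding
   or reordering the two additions. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Sums the same values right to left: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e20, b = -1e20, and c = 1.0, `sum_left` returns 1.0 while `sum_right` returns 0.0, because 1.0 is lost when it is first added to -1e20. A compiler that is free to reassociate may pick either order.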
This option controls the prefetches that are issued before the loop is entered. These prefetches target the initial iterations of the loop. The default is -opt-prefetch-initial-values (prefetch for initial iterations on) at -O1 and higher optimization levels.
Tells the compiler to assume that the program adheres to the aliasing rules defined in the ISO C Standard. The default is not to assume such adherence. If your C/C++ program adheres to these rules, then -ansi-alias allows the compiler to optimize more aggressively. If it does not, then assuming so can cause the compiler to generate incorrect code.
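As an illustration of what adhering to these rules means in practice, the sketch below reads the bit pattern of a float in a way that does not break the aliasing rules. The function name `float_bits` is hypothetical, and the code assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Conforming: the bytes are copied, so no object is accessed through
   an lvalue of an incompatible type.
   A non-conforming alternative would be: return *(uint32_t *)&f;
   which reads a float object through a uint32_t lvalue. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Code written like the pointer-cast variant in the comment is the kind that may be miscompiled when the compiler is told to assume adherence.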
The -Wl option directs the compiler to pass a list of arguments to the linker. In this case, "-z muldefs" is passed to the linker. For the GNU linker (ld), the "-z keyword" option accepts several recognized keywords. The keyword "muldefs" allows multiple definitions. It enables, for example, linking with third-party libraries such as SmartHeap from MicroQuill.
MicroQuill SmartHeap Library available from http://www.microquill.com/
This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.
Enables O2 optimizations plus more aggressive optimizations,
such as prefetching, scalar replacement, and loop and memory
access transformations. Enables optimizations for maximum speed,
such as:
- Loop unrolling, including instruction scheduling
- Code replication to eliminate branches
- Padding the size of certain power-of-two arrays to allow
more efficient cache use.
On Intel Itanium processors, the O3 option enables optimizations
for technical computing applications (loop-intensive code):
loop optimizations and data prefetch.
The O3 optimizations may not cause higher performance unless loop and
memory access transformations take place. The optimizations may slow
down code in some cases compared to O2 optimizations.
The O3 option is recommended for applications that have loops that heavily
use floating-point calculations and process large data sets.
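To make the loop unrolling mentioned above concrete, here is a hand-written sketch of the transformation. The compiler applies it automatically and may choose a different unroll factor; the function names and the factor of 4 are assumptions for illustration.

```c
#include <stddef.h>

/* Straightforward loop: one addition and one branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4: one branch per four elements, plus a
   remainder loop for the last n % 4 elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

Both functions return the same sum; the unrolled form trades code size for fewer branches, which is why it may help or hurt depending on the loop.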
Enables optimizations for speed. This is the generally recommended
optimization level. This option also enables:
- Inlining of intrinsics
- Intra-file interprocedural optimizations, which include:
  - inlining
  - constant propagation
  - forward substitution
  - routine attribute propagation
  - variable address-taken analysis
  - dead static function elimination
  - removal of unreferenced variables
- The following capabilities for performance gain:
  - constant propagation
  - copy propagation
  - dead-code elimination
  - global register allocation
  - global instruction scheduling and control speculation
  - loop unrolling
  - optimized code selection
  - partial redundancy elimination
  - strength reduction/induction variable simplification
  - variable renaming
  - exception handling optimizations
  - tail recursions
  - peephole optimizations
  - structure assignment lowering and optimizations
  - dead store elimination
Enables optimizations for speed and disables some optimizations that
increase code size and affect speed.
To limit code size, this option:
- Enables global optimization; this includes data-flow analysis,
code motion, strength reduction and test replacement, split-lifetime
analysis, and instruction scheduling.
- Disables intrinsic recognition and intrinsics inlining.
The O1 option may improve performance for applications with very large
code size, many branches, and execution time not dominated by code within loops.
On IPF Linux64 platforms, -O1 disables software pipelining and global code scheduling.
On Intel Itanium processors, this option also enables optimizations for server applications
(straight-line and branch-like code with a flat profile).
-unroll0, -fbuiltin, -mno-ieee-fp, -fomit-frame-pointer (same as -fp), -ffunction-sections
Tells the compiler the maximum number of times (n) to unroll loops.
Enables inline expansion of all intrinsic functions.
Disables conformance to the ANSI C and IEEE 754 standards for floating-point arithmetic.
Allows use of EBP as a general-purpose register in optimizations.
Places each function in its own COMDAT section.
Multi-file interprocedural (IP) optimizations that include:
- inline function expansion
- interprocedural constant propagation
- dead code elimination
- propagation of function characteristics
- passing arguments in registers
- loop-invariant code motion
-static prevents linking with shared libraries.
Platform settings
One or more of the following settings may have been set. If so, the "General Notes" section of the report will say so, and you can read below to find out more about what these settings mean.
limit stacksize unlimited
Sets the stack size to n kbytes, or unlimited to allow the stack size to grow without limit.
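A process can check which stack limit it actually inherited from the shell. The sketch below uses the POSIX `getrlimit` call; the function name `stack_is_unlimited` is hypothetical.

```c
#include <sys/resource.h>

/* Returns 1 if the soft stack limit is unlimited, 0 if it is finite,
   or -1 if the limit could not be queried. */
int stack_is_unlimited(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    return rl.rlim_cur == RLIM_INFINITY;
}
```

After "limit stacksize unlimited" has been issued in the launching shell, this function is expected to return 1 in child processes.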
dplace [-c cpu_numbers] [-r [l|b|t]] [-v 1|2]
Dplace is used to bind a related set of processes to specific CPUs or nodes to prevent process migration. In some cases, this will improve performance since a higher percentage of memory accesses will be to the local node.
Version 1 of numatools required kernel support for PAGG process placement groups. This support is no longer available in all kernel variants.
Version 2 of numatools uses a preload library to intercept calls to fork(), exec() (all variants), pthread_create() and pthread_exit(). The intercept code performs placement as part of the library call. In most cases, version 1 and version 2 are compatible. In some cases, however, a user will notice differences:
Cpusets
The cpuset facility is primarily a workload manager tool permitting a system administrator to restrict the number of processor and memory resources that a process or set of processes may use. A cpuset defines a list of CPUs and memory nodes. A process contained in a cpuset may only execute on the CPUs in that cpuset and may only allocate memory on the memory nodes in that cpuset. Essentially, cpusets provide you with CPU and memory containers or soft partitions within which you can run sets of related tasks. Using cpusets on an SGI Altix system improves cache locality and memory access times and can substantially improve an application's performance and runtime repeatability. Restraining all other jobs from using any of the CPUs or memory resources assigned to a critical job minimizes interference from other jobs on the system.
The default cpuset for the init process, classic UNIX daemons, and user login shells is the root cpuset that contains the entire system. For systems dedicated to running particular applications, it is better to restrict init, the kernel daemons, and login shells to a particular set of CPUs and memory nodes called the boot cpuset.
/dev/cpuset/memory_spread_page, /dev/cpuset/memory_spread_slab
There are two Boolean flag files per cpuset that control where the kernel allocates pages for the file system buffers and related in-kernel data structures. They are called memory_spread_page and memory_spread_slab.
If the per-cpuset Boolean flag file memory_spread_page is set, the kernel will spread the file system buffers (page cache) evenly over all the nodes that the faulting task is allowed to use, instead of preferring to put those pages on the node where the task is running.
If the per-cpuset Boolean flag file memory_spread_slab is set, the kernel will spread some file system related slab caches, such as for inodes and directory entries, evenly over all the nodes that the faulting task is allowed to use, instead of preferring to put those pages on the node where the task is running.
The setting of these flags does not affect anonymous data segment or stack segment pages of a task.
When new cpusets are created, they inherit the memory spread settings of their parent.
Setting memory spreading causes allocations for the affected page or slab caches to ignore the task's NUMA mempolicy and be spread instead. Tasks using mbind() or set_mempolicy() calls to set NUMA mempolicies will not notice any change in these calls as a result of their containing task's memory spread settings. If memory spreading is turned off, the currently specified NUMA mempolicy once again applies to memory page allocations.
Both memory_spread_page and memory_spread_slab are Boolean flag files. By default they contain "1", meaning that the feature is on for that cpuset. If a "0" is written to that file, that turns the named feature off.
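Because these flags are ordinary files holding "0" or "1", their current value can be read with standard file I/O. The following is a minimal sketch; the helper name `read_bool_flag` is hypothetical, and the cpuset file system is assumed to be mounted at /dev/cpuset as in the paths above.

```c
#include <stdio.h>

/* Reads a cpuset Boolean flag file. Returns 0 or 1 for the stored value,
   or -1 if the file cannot be opened or does not start with '0' or '1'. */
int read_bool_flag(const char *path)
{
    FILE *fp = fopen(path, "r");
    int c;

    if (fp == NULL)
        return -1;
    c = fgetc(fp);
    fclose(fp);
    if (c == '0')
        return 0;
    if (c == '1')
        return 1;
    return -1;
}
```

For example, `read_bool_flag("/dev/cpuset/memory_spread_page")` reports the setting of the root cpuset on a system where that path exists, and returns -1 elsewhere.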
SGI ProPack for Linux
SGI ProPack is a suite of performance optimization libraries and tools for SGI Linux systems. It includes application accelerators such as NUMAtools and Flexible File I/O, parallel programming tools such as the Message Passing Toolkit, real-time performance via SGI REACT, and performance monitoring tools such as Performance Co-Pilot.
The dplace utility from the ProPack NUMAtools package is used to pin processes in CPU2006 rate runs.
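The effect of pinning can be approximated from inside a single process with the Linux affinity calls. This sketch is not how dplace is implemented; it uses `sched_getaffinity` and `sched_setaffinity` from glibc as a stand-in, and the function name is hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pins the calling process to the lowest-numbered CPU it is currently
   allowed to run on. Returns that CPU number, or -1 on error. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, target;

    if (sched_getaffinity(0, sizeof allowed, &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_ZERO(&target);
            CPU_SET(cpu, &target);
            if (sched_setaffinity(0, sizeof target, &target) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}
```

Once pinned, the scheduler no longer migrates the process to another CPU, which is the property that rate runs rely on when each copy is bound to its own CPU.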
For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact webmaster@spec.org
Copyright 2006-2014 Standard Performance Evaluation Corporation
Tested with SPEC CPU2006 v1.1.
Report generated on Wed Jul 23 03:41:32 2014 by SPEC CPU2006 flags formatter v6906.