SPEC CPU2017 Config Files

$Id: config.html 5817 2017-06-22 13:19:28Z JohnHenning $ Latest: www.spec.org/cpu2017/Docs/

I. Config file basics

A. What is a config file?

B. Benchmark selection

C. Five consumers

D. Syntax

1. Sections and scope

2. Comments
3. Whitespace

4. Quoting

5. Line continuation

6. Included files

7. Section markers

E. Section merging

Precedence Examples
1. Benchmark specifier
2. Tuning specifier
3. Label specifier
4. Combining sections
5. Conflicting sections
6. Section order
7. Repeated Sections

F. Variables

1. Defining variables

2. $[square] substitution

3. Useful $[variables]

4. ${curly} interpolation

5. Useful ${variables}

6. Unsetting "%undef%"

7. Debug tips

II. Config file options for runcpu

A. Precedence: config file vs. runcpu command line

B. Options

action allow_label_override (New) backup_config basepeak bind check_version command_add_redirect copies current_range delay deletework difflines enable_monitor (New) env_vars expand_notes expid fail fail_build fail_run feedback flagsurl force_monitor (New) http_proxy http_timeout idle_current_range ignore_errors ignore_sigint info_wrap_columns iterations keeptmp line_width label (New) locking log_line_width log_timestamp mail_reports mailcompress mailmethod mailport mailserver mailto make make_no_clobber makeflags mean_anyway minimize_builddirs minimize_rundirs no_input_handler no_monitor nobuild notes_wrap_columns notes_wrap_indent output_format output_root parallel_test parallel_test_submit parallel_test_workloads (New) plain_train power (New) power_analyzer preenv rebuild reportable runlist save_build_files (New) section_specifier_fatal setprocgroup size src.alt strict_rundir_verify sysinfo_program table teeout temp_meter train_single_thread (New) train_with tune use_submit_for_compare (New) use_submit_for_speed verbose verify_binaries (New) version_url voltage_range

III. Config file options for specmake

A. Commonly used Make variables

PORTABILITY (+ several similar options)
OPTIMIZE (+etc.)

B. New with CPU2017: Using OpenMP and/or Autopar

1. Summary

2. SPECrate: no OpenMP. No Autopar.

3. SPECspeed: your choice

4. Detail - conditions for enabling OpenMP

C. Creating your own Make variables

D. += is available but use with caution

E. Using buildsetup to create a sandbox

F. About Automatic Rebuilds

{C|CXX|F}C_VERSION_OPTION (New and required with CPU2017)

IV. Config file options for the shell

A. \$SHELLVAR variable substitution

B. Options

bench_post_setup build_pre_bench build_post_bench fdo_pre0 fdo_preN fdo_make_cleanN fdo_pre_makeN fdo_make_passN fdo_post_makeN fdo_runN fdo_postN monitor_X post_setup submit

C. Using submit

1. Basic usage

2. Script generation

3. Quote traps

4. Debug tips

5. Maintainability

6. Reporting

Submit Examples:

1. ${command} and $SPECCOPYNUM

2. bind list

3. Generating scripts and command_add_redirect

4. Generating a script for numactl

5. Line continuation

6. Magical notes_submit

V. Config file options for human readers

A. Descriptive fields

fw_bios (New) hw_avail hw_cpu_max_mhz (New) hw_cpu_name hw_cpu_nominal_mhz (New) hw_disk hw_memory hw_model hw_nchips hw_ncores hw_ncpuorder hw_nthreadspercore hw_ocache hw_other hw_pcache hw_power_{id}_cal_date (New) hw_power_{id}_cal_label hw_power_{id}_cal_org hw_power_{id}_met_inst hw_power_{id}_connection hw_power_{id}_label hw_power_{id}_model hw_power_{id}_serial hw_power_{id}_setup hw_power_{id}_vendor hw_psu hw_psu_info hw_scache hw_tcache hw_vendor license_num prepared_by sw_avail sw_base_ptrsize sw_compiler sw_file sw_os sw_state sw_other sw_peak_ptrsize tester test_sponsor

B. Fields can vary by scope

C. Additional notes for the reader

1. Notes sections

notes_comp_NNN notes_port_NNN notes_base_NNN notes_peak_NNN notes_submit_NNN notes_os_NNN notes_plat_NNN notes_part_NNN notes_NNN

2. Note numbering

3. Additional tags

4. Links in notes sections

D. Parallel Reporting

VI. Using Feedback Directed Optimization (FDO)

A. Minimum required: PASSn or fdo_

B. Flexible build models

C. The config file feedback option

D. runcpu --feedback

FDO Examples:

1. build - train - rebuild

2. build - train - othertool

3. Finding fdo commands with --fake

4. build - train - partial rebuild

5. build-train-build-train-build-train-postopt + file movement

6. feedback=no

7. runcpu --feedback

8. FDO tracks in the log file

VII. The config file preprocessor

A. Introduction

1. Example: Picking flags

2. Example: Picking CPUs

3. Syntax Basics

B. Defining macros

1. In a config file

2. On the command line

3. Predefined macros: endian, hostname...
%{ENV_variable_name} New with CPU2017

4. Example: Adjusting the Environment

5. Redefinition

C. Un-doing macro definition

D. Using macros

E. Conditionals

1. %ifdef .. %endif

2. %ifndef .. %endif

3. %if .. %endif

4. %else

5. %elif

F. Informational directives

1. %warning

2. %error

3. %info New with CPU2017

4. %dumpmacros New with CPU2017

VIII. Output files - and how they relate to your config file

A. Help, I've got too many config files!

B. The log file

1. Useful Search Strings

2. About Temporary Debug Logs

3. Verbosity levels

C. Deciphering an FDO log file

D. Help, I've got too many log files!

E. Finding the build directory

F. Files in the build directory

G. For more information

Appendix
A. Other benchmark sets

B. Troubleshooting

C. Obsolete features

allow_extension_override company_name ext hw_cpu_char hw_cpu_mhz hw_cpu_ncoresperchip hw_fpu mach machine_name max_active_compares rate speed sw_auto_parallel test_date tester_name VENDOR


I. Config file basics

I.A. What is a config file?

A SPEC CPU config file is a file that defines how to build, run, and report on the SPEC CPU benchmarks in a particular environment. It defines how SPEC CPU2017 interacts with your test environment, and lets you describe that environment in human-readable form.

A config file provides detailed control of testing, including:

  • Tuning and portability, which you may customize for your compilers.
  • The runtime environment, which you may customize for your operating system.
  • A description of the SUT (System Under Test), which you should customize to create an accurate report.

Using your customized options, the SPEC CPU tools automatically create Makefiles, build the benchmarks, run them, generate reports, and write log files.

Because they collect your options into one place, config files are key to result reproducibility.

For example, if a vendor publishes CPU2017 results for the SuperHero Model 42 at the SPEC web site www.spec.org/cpu2017, it is expected that a customer can demonstrate similar performance using only 3 ingredients:

  1. The SPEC CPU2017 media, ordered from SPEC.
  2. The software and hardware as specified in the full disclosure from the published result.
  3. The config file, which is posted along with the results.

q. This document looks big and intimidating. Where do I start?

a. Don't start here. Start with the Overview and Using SPEC CPU2017 - the 'runcpu' Command.

After that, please read section I.C carefully, which explains that config files contain options for five (5) different consumers. You need to recognize which options are for which consumers.

Please notice that config files have 3 kinds of sections. You need to know how named sections work.

From that point on, you can probably skip around among topics as they may interest you.

Tip: Most topics can be found by adding '#topic' to the URL for this document. Examples: #consumers  #shell  #readers  #sw_compiler  #OpenMP

I.B. Benchmark selection

In a config file, you can reference one or more individual benchmarks, such as 500.perlbench_r, or entire suites, using the Short Tags in the table below.

Short Tag  Suite                           Contents                       Metrics
intspeed   SPECspeed 2017 Integer          10 integer benchmarks          SPECspeed2017_int_base
fpspeed    SPECspeed 2017 Floating Point   10 floating point benchmarks   SPECspeed2017_fp_base
intrate    SPECrate 2017 Integer           10 integer benchmarks          SPECrate2017_int_base
fprate     SPECrate 2017 Floating Point    13 floating point benchmarks   SPECrate2017_fp_base

How many copies? SPECspeed suites always run one copy of each benchmark. SPECrate suites run multiple concurrent copies of each benchmark; the tester selects how many.

What do higher scores mean? For SPECspeed, higher scores indicate that less time is needed. For SPECrate, higher scores indicate more throughput (work per unit of time).

The "Short Tag" is the canonical abbreviation for use with runcpu, where context is defined by the tools. In a published document, context may not be clear.
To avoid ambiguity in published documents, the Suite Name or the Metrics should be spelled as shown above.
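
For example, a Short Tag may be used wherever runcpu accepts a benchmark list, either in a config file or on the command line (a sketch; the config file name 'mine.cfg' is hypothetical):

```
# In mine.cfg: select the entire SPECrate 2017 Floating Point suite
runlist = fprate
```

The equivalent command-line selection would be: runcpu --config=mine fprate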

(Other benchmark sets are available, but must be used with great caution. They are described in Appendix A.)

I.C. Five consumers

A config file has content for five (5) distinct consumers, as shown in the table.
To understand a config file, you must understand which consumer is addressed at any given point.
Each consumer below is listed with the chapter (roman numeral) of this document that provides many more examples, along with a sample option and a brief description.

runcpu (chapter II; example: copies)

Options that change how runcpu itself works. Many should be familiar from Using SPEC CPU2017.
Putting your desired options in the config file can simplify the command line. For example, if michael.cfg includes the lines:

output_format = text,csv
tune          = base
reportable    = yes
runlist       = fpspeed 

then the defaults for the runcpu command would change as shown. Both of these would do the same thing:

runcpu --config=michael
runcpu --config=michael --output=text,csv --tune=base --reportable fpspeed 

specmake (chapter III)

Make variables, to control benchmark builds via specmake.

Commonly used specmake variables: section III
Many more: Make Variables
Using specmake on the command line: Utilities

shell (chapter IV; example: fdo_post1)

Commands to be executed by the Unix shell (/bin/sh) or the Windows command interpreter (cmd.exe).

fdo_pre0 = rm -Rf /tmp/manmohan/feedback
fdo_pre1 = mkdir /tmp/manmohan/feedback 

Above, a directory (and all its contents) are deleted, and a new one created, prior to a training run for feedback-directed optimization.

Warning: SPEC CPU config files can execute arbitrary shell commands.
Read a config file before using it.
Don't be root. Don't run as Administrator. Turn privileges off.

readers (chapter V; example: hw_model)

System Under Test (SUT) description, with enough detail so that the reader can understand what was tested and can reproduce your results. If a SPEC CPU2017 result is published (whether at SPEC or independently) it must be fully described.

preprocessor (chapter VII; example: %define)

Preprocessing directives and definitions to adjust your config file prior to use. All preprocessor directives begin in column 1. Example:

%if %{bits} == 64
%   define model  -m64
%else
%   define model  -m32
%endif
CC  = gcc %{model} 

I.D. Config file syntax

I.D.1. Sections and scope

A config file contains: a header section, named sections, and a HASH section.

Scope: Every line is considered to be within the scope of one of these three. Lines prior to the first section marker are in the scope of the header section. All other lines are either in the scope of the most recently preceding section marker or in the HASH section.

header section

The header section is the first section, prior to any named section. Most attempts to address runcpu itself are done in the header section. In the example, lines 1 through 6 are in the header section.

$ cat -n threeSections.cfg
 1  flagsurl             = $[top]/config/flags/gcc.xml
 2  iterations           = 1
 3  output_format        = text
 4  runlist              = 519.lbm_r
 5  size                 = test
 6  tune                 = peak
 7  fprate=peak:
 9     CC                = gcc
10     OPTIMIZE          = -O2
11  519.lbm_r=peak:
12     OPTIMIZE          = -O3
14  __HASH__
15  519.lbm_r=peak=none:
16  # Last updated 2017-02-06 14:29:40
17  opthash=ff6059d6d9ec9577f7f49d05178c58688f31004089
18  baggage=
19  compiler_version=\
20  @eNo1jbEOgjAYhPc+RUcdaFEJIWwGHEhQjBLjRn7LLzapLWkL8
21  kpP9OCqkdX070hmtk0bTjMUspiuhQA9RFgfDki3brEkLdkCf0y
22  0NO36VHldDROLqTSzoNS2JfS5pT/DqUAH54cv4tAsjDEC6M9au
23  FuH/CZ+c5Q+pyRd+tUlX
24  compile_options=\
25  @eNp1T11PgzAUfe+vaPrOMjNfJGMJlLqhQBtLX3xptHYGBWoKM
26  vNp93Vjs3obadX2I+sHXZtD+0D3VXr9bX+8/InJBEMAeKBFZLq
27  KgWj06epUADLlCVqe+rFquJaKiHumJSaC1YWAuOAr/DPWvfu4I
28  mALF6zzeSpj96+RITFhJd3rG/dMaQTzEIJYV2T0DJl8RlGfl7Z
29  hjyd8vw+D9MirnY6z5LRW9NOC1yNkc/OfAJtxnuD
30  exehash=5290fe504238c6de1a13e275ab8ca11e035fbb4e7e

Options for the header section may also be entered in a section marked

   default=default=default:

or a shorter spelling thereof, such as

   default:

This can be helpful if you need to effectively return to the header section, perhaps when using include.

Tip: Nevertheless, it is usually easier to maintain a config file that keeps all runcpu options near the top, preferably in alphabetical order.

Named sections

A named section is a portion of the config file that follows a line that contains a section marker.

Briefly, a section marker is a 1-to-3 part string with equals signs in the middle and a colon at the end; see detail below. The example has 2 named sections, delimited by 2 section markers on lines 7 and 11.

Notice that the example sets OPTIMIZE in both of the named sections. To understand which one gets used, see the precedence rules, which describe how sections interact.

HASH section

The HASH Section is the final, automatically-generated section. It starts with __HASH__ and contains checksums. The example starts the HASH section at line 14. (For readability, the lines are truncated on the right.)

The automatically-updated checksums ensure that binaries match their config file. You can optionally disable checking, but doing so is strongly discouraged. See verify_binaries.

Config files printed by --output_format=config do not include the HASH section.

I.D.2. Comments

Comments begin with '#'. There are two types:

Syntax   Type        Saved?
#        Regular     Yes
#>       Protected   No

Regular comments can be full-line or trailing. A copy of your config file is saved with the test results. If you submit your results to SPEC for publication, the regular comments can be read by anyone.

A full-line comment that begins with #> is a protected comment and will not be saved with your results. You can use protected comments for proprietary information, for example:

#> I didn't use the C++ beta because of Bob's big back-end bug.  

Both types of comments are ignored when processing the config file.

Full-line comments: If # is the first non-blank item on a line, then the whole line is a comment. Comment lines can be placed anywhere in a config file.

Trailing comments: If a line has non-blank items, you can (usually) add regular comments. You cannot write a protected trailing comment. If you try to use a protected comment after some other element, it is treated as a regular comment.

All comments below will be saved except the one that says 'NOT saved'.

                       # New optimizers.
default=base:          # Most optimizers go up to ten.
   OPTIMIZE   = -O11   #    These go to eleven.
                       #> This comment is NOT saved
   COPTIMIZE  = -std   #> This comment is saved

Not a comment: Some instances of # do not start a comment.

I.D.3. Whitespace

Blank lines can be placed anywhere in a config file. They are ignored.

Spaces at the beginning of lines are ignored, with the exception that preprocessor directives always begin with a percent sign (%) in the first column. You can put spaces after the percent sign, if you wish (example).
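
Both spellings below are valid preprocessor directives; each percent sign is in column 1 (a sketch using the %define directive described in chapter VII):

```
%define bits 64
%   define model -m64
```

The spaces after the percent sign on the second line are ignored.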

Spaces within a line are usually ignored. You cannot break up a section marker, and you cannot break up a single word (say 'OPTIMIZE', not 'OPT I MIZE'). If multiple spaces separate line elements, it is as if there were only one space. Each of these has the same meaning:

   OPTIMIZE=-O2 -noalias
   OPTIMIZE   =  -O2      -noalias

One place where spaces are considered significant is in notes, where you can use spaces to line up your comments. Notes are printed in a fixed-width font.
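
For example, extra spaces can be used to line up note text (a sketch; the descriptions are hypothetical):

```
notes_plat_000 = Disks:  2x 8 TB SAS
notes_plat_005 = Memory: 64 GB
```

Because notes are printed in a fixed-width font, the values line up in reports just as they do here.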

Trailing spaces and tabs are ignored, unless they are preceded by a backslash. For example, if trailing_space.cfg contains:

$ cat trailing_space.cfg 
PATH1   = /path/without/any/trailer
FC      = $(PATH1)/f90
PATH2   = /path/with/trailing/space\ 
CC      = $(PATH2)/cc

then we can use --fake to demonstrate the compile commands that would be generated (plus the Unix commands grep, head, and cut to pick out one example).

$ cat trailing_space.sh 
runcpu -c trailing_space --fake 527.cam4 | grep without/ | head -1 | cut -b 1-80
runcpu -c trailing_space --fake 527.cam4 | grep trailing | head -1 | cut -b 1-80
$ ./trailing_space.sh 
/path/without/any/trailer/f90 -c -o ESMF_BaseMod.fppized.o -I. -Iinclude -Inetcd
/path/with/trailing/space /cc -c -o GPTLget_memusage.o -DSPEC -DSPEC_CPU -DNDEBU
$  (Notes about examples) 

Notice that the PATH2 trailing space is preserved, splitting the path apart from /cc.

I.D.4. Quoting

If you use double (") or single (') quotes within a config file, runcpu leaves them alone. The assumption is that you put them there because one of the consumers (such as a shell) needs them. The quotes are not significant to runcpu but may be highly significant to the consumer. See the section on quote traps.
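
For example, in the sketch below (the macro name GREETING is hypothetical), runcpu passes both kinds of quotes through untouched; the shell then strips the single quotes, and the compiler receives a quoted string:

```
EXTRA_CFLAGS = -DGREETING='"Hello, world"'
```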

If you use a backslash (\) it is usually not significant. The exceptions are: a backslash at the end of a line, which marks a line continuation; and the sequence \$, which requests shell variable substitution.
I.D.5. Line continuation

Many fields, including most reader fields, can be continued by adding a number:

sw_os1         = Turboblaster OS V1.0
sw_os2         = (Tested with Early Hardware Release 0.99
sw_os3         = and Patch 42.)
hw_disk105     = 42 TB on 6x 8 TB 10K RPM SAS Disks
hw_disk110     = arranged as 4x 2-way mirrors; plus  
hw_disk115     = Turboblaster Disk Accelerator 

The fields which cannot be continued are the ones that are expecting a simple integer, such as hw_nchips and license_num; and the ones that expect a date, such as hw_avail. You can pick your own style of numbering, as in the examples above. (Note: the stored results from your test always use three-digit numbers, and have slightly different syntax, as discussed in utility.html.)

Shell-style "here documents" with double angle brackets and a delimiter word (e.g. <<EOT) can be used to set multi-line values. Backslash-continued lines are also supported. For example:

$ cat continued_lines.cfg
expand_notes   = 1
output_format  = text
output_root    = /tmp/fake_lines
runlist        = 519.lbm_r

here_continued = <<EOT
      + This is +
       + a test  +
EOT

backslash_continued = + So is +\
 + this  +

notes1 = ${here_continued}
notes2 = ${backslash_continued}
$ cat continued_lines.sh 
runcpu --config=continued_lines --fakereport | grep txt
grep '+' /tmp/fake_lines/result/CPU2017*txt
$ ./continued_lines.sh 
    format: Text -> /tmp/fake_lines/result/CPU2017.001.fprate.refrate.txt
           + This is +
           + a test  +
     + So is +
     + this  +
$   (Notes about examples) 

I.D.6. Included files

You can include other files in your config file using include:

  • Multiple files may be included.
  • Included files may use macros (and you can use configpp to check the effect).
  • Included files may write to arbitrary sections, including (effectively) the header section.

Example: a config file is developed on one system, and applied on a different System Under Test (SUT). The compilers are, of course, installed on the development system. They might not be present on the SUT.

The compiler is described in the main config file on lines 10-12.
The hardware is described in SUT.inc which is included via line 6.

$ cat -n include.cfg
 1  iterations    = 1
 2  output_format = text
 3  output_root   = /tmp/example
 4  runlist       = 519.lbm_r
 5  size          = test
 6  include: SUT.inc
 7  default:
 8    CC                = gcc
10    sw_compiler001    = C/C++/Fortran: Version 6.2.0 of GCC
11    sw_compiler002    = the GNU Compiler Collection
12    sw_avail          = Aug-2016

Note the software date (sw_avail) on line 12 above.
Today's SUT is the Turboblaster SuperHero IV.
Copy its information to the expected place.

$ cp Turboblaster.inc SUT.inc
$ cat Turboblaster.inc 
   hw_model  = SuperHero IV
   hw_avail  = Feb-2018
   hw_vendor = Turboblaster

Note the hardware date (hw_avail) in the include file.
Both dates are referenced in reports:

$ cat include.sh 
runcpu --config=include | grep txt
grep avail /tmp/example/result/*txt
$ ./include.sh 
format: Text -> /tmp/example/result/CPU2017.001.fprate.test.txt
Test sponsor: Turboblaster      Hardware availability: Feb-2018
Tested by:    Turboblaster      Software availability: Aug-2016
$ (Notes about examples) 

Notes on the Examples

Many of the examples that begin with cat somefile are on your installed copy of SPEC CPU2017, in directory $SPEC/config/tiny-examples or %SPEC%\config\tiny-examples.

cat example.cfg Display config file. Windows users can use type.

cat example.sh For config files with many lines of output, a corresponding .sh script file has commands to pick out a subset. Windows users could create a .bat file with call runcpu.

cat -n Add line numbers.

CC_VERSION_OPTION How to say to the compiler "Please tell me your version". A required option for all except fake runs. (detail)

cp Copy a file. Windows users could say copy.

cut, head, tail Various ways to subset output.

default=base: A section marker

--fake Many examples use --fake or --fakereportable, to quickly provide a dry run demonstration. You can try fake examples without even installing a compiler.

iterations=1 Demonstrate using just one repetition. Warning: Reportable runs use at least 2.

grep Search the output. Windows users could try findstr.

ls List files. Windows users could say dir.

output_root Send results to named directory.

runlist Often, 519.lbm_r is demonstrated because it compiles quickly.

size=test For demonstration purposes, use the very short workload. Warning: The intent of 'test' is just a sanity check that a binary works. It is not a comparable metric. If you publish results, you must use the 'ref' workloads.

For other fields, see table of contents. Most examples were tested with a SPEC CPU2017 release candidate and this document contains their actual output, except: white space was liberally edited and (in a very few cases) line width was reduced by chopping out words.

I.D.7. Section markers and Named sections

  • A named section begins with a section marker and continues until the next section marker or the HASH section is reached.
  • Named sections can be entered in any order.
  • Section markers can be repeated. Material from repeated sections is automatically consolidated.
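
For instance, in this sketch the marker fprate: occurs twice:

```
fprate:
   CC       = gcc
fprate:
   OPTIMIZE = -O2
```

The result is as if a single fprate: section had set both CC and OPTIMIZE. (If the same variable were set in both, the last instance would win; see Precedence Example 7.)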

A section marker is a one- to three-part string of the form:

   benchmark=tuning=label:
The three parts of a section marker are called the section specifiers, with allowed values:

benchmark:
   default
   A metric: intrate, intspeed, fprate, or fpspeed
   An Auxiliary Benchmark Set (not recommended; see warning)
   Any individual benchmark, such as 503.bwaves_r
   A list, such as: 503.bwaves_r,603.bwaves_s

tuning:
   default
   A list of tuning levels: base,peak

label:
   default
   Any specific label: an arbitrary tag to identify binaries and directories
   A list of labels, separated by commas

Trailing default section specifiers may be omitted from a section marker. In the pairs below, in each case, the second line is equivalent to the first:

intrate=default=default:     628.pop2_s=base=default:      default=default=default:
intrate:                     628.pop2_s=base:              default:

I.E. Section merging and precedence rules

By constructing section markers, you determine how you would like your options applied. Benchmarks are built according to instructions in the sections that they match, subject to rules for combining sections and resolving conflicts among them. The examples below illustrate these rules.

Precedence Example 1: benchmark specifiers

For the benchmark specifier, the precedence is:

  highest   named benchmark(s)
            suite name
  lowest    default

The flagsurl line picks up definitions that are provided with SPEC CPU2017.

519.lbm_r is an fprate benchmark, but it does not use the setting on lines 14-15. Instead, it uses the higher precedence lines 10-11 (named benchmark).

619.lbm_s is an fpspeed benchmark. The highest precedence section for it is on lines 12-13.

505.mcf_r is an intrate benchmark. It gets the low-precedence OPTIMIZE setting from lines 16-17.

$ cat -n precedence_example1.cfg
     1  flagsurl             = $[top]/config/flags/gcc.xml
     2  iterations           = 1
     3  output_format        = text
     4  output_root          = /tmp/ptest
     5  runlist              = 519.lbm_r,619.lbm_s,505.mcf_r
     6  size                 = test
     7  default:
     8     CC_VERSION_OPTION = -v
     9     CC                = gcc
    10  519.lbm_r:
    11     OPTIMIZE          = -O3
    12  fpspeed:
    13     OPTIMIZE          = -O2
    14  fprate:
    15     OPTIMIZE          = -O1
    16  default:
    17     OPTIMIZE          = -O0
$ cat -n precedence_example1.sh 
     1  runcpu --config=precedence_example1 | grep txt
     2  cd /tmp/ptest/result
     3  grep 'O[0-9]' *txt
$ ./precedence_example1.sh 
    format: Text -> /tmp/ptest/result/CPU2017.001.fprate.test.txt
    format: Text -> /tmp/ptest/result/CPU2017.001.fpspeed.test.txt
    format: Text -> /tmp/ptest/result/CPU2017.001.intrate.test.txt
CPU2017.001.fprate.test.txt: 519.lbm_r: -O3
CPU2017.001.fpspeed.test.txt: 619.lbm_s: -O2
CPU2017.001.intrate.test.txt: 505.mcf_r: -O0
$ (Notes about examples) 
Precedence Example 2: tuning specifiers

For the tuning specifier, base or peak has higher precedence than default.

The first few lines of the config file use similar features as the previous example.

The tuning from line 12 is used for base, and line 14 for peak.

$ cat -n precedence_example2.cfg 
     1  flagsurl             = $[top]/config/flags/gcc.xml
     2  iterations           = 1
     3  output_format        = text
     4  output_root          = /tmp/ptest2
     5  runlist              = 519.lbm_r
     6  size                 = test
     7  default=default:
     8     CC_VERSION_OPTION = -v
     9     CC                = gcc
    10     OPTIMIZE          = -O0
    11  default=base:
    12     OPTIMIZE          = -O1
    13  default=peak:
    14     OPTIMIZE          = -O3
$ cat precedence_example2.sh 
runcpu --config=precedence_example2 --tune=base | grep txt
runcpu --config=precedence_example2 --tune=peak | grep txt
cd /tmp/ptest2/result
grep 'O[0-9]' *txt
$ ./precedence_example2.sh 
    format: Text -> /tmp/ptest2/result/CPU2017.001.fprate.test.txt
    format: Text -> /tmp/ptest2/result/CPU2017.002.fprate.test.txt
CPU2017.001.fprate.test.txt: 519.lbm_r: -O1
CPU2017.002.fprate.test.txt: 519.lbm_r: -O3
$  (Notes about examples) 
Precedence Example 3: label specifiers

For the label specifier, any named label has higher precedence than the default.

This config file is simpler than the previous examples, because we don't even bother to run it; instead, --fake is used.

The runcpu command uses --label=OhTwo. Therefore, the default setting for OPTIMIZE on line 6 is overridden by the settings on lines 7-8.

$ cat -n precedence_example3.cfg 
 1  runlist              = 619.lbm_s
 2  size                 = test
 3  fpspeed=base=default:
 5     CC                = gcc
 6     OPTIMIZE          = -O0
 7  fpspeed=base=OhTwo:
 8     OPTIMIZE          = -O2
$ cat precedence_example3.sh 
runcpu --config=precedence_example3 --fake --label=OhTwo | grep lbm.c
$ ./precedence_example3.sh 
gcc -c -o lbm.o -DSPEC -DSPEC_CPU -DNDEBUG -DLARGE_WORKLOAD  -O2 lbm.c
$  (Notes about examples) 
Precedence Example 4: Combining sections

Combine sections that apply to a benchmark, if there is no conflict among them.

Note that line 1 sets the label, and line 3 sets the tuning.

All sections -- including lines 6, 8, 10, and 12 -- contribute to the compile command, which has been wrapped for readability.

$ cat -n precedence_example4.cfg 
  1  label                = wall
  2  runlist              = 619.lbm_s
  3  tune                 = peak
  4  default:
  5     CC_VERSION_OPTION = -v
  6     CC                = gcc
  7  fpspeed:
  8     OPTIMIZE          = -O1 
  9  default=peak:
 10     COPTIMIZE         = -ftree-vectorize
 11  default=default=wall:
 12     EXTRA_COPTIMIZE   = -Wall
$ cat precedence_example4.sh  
runcpu --config=precedence_example4 --fake | grep lbm.c
$ ./precedence_example4.sh 
     -O1 -ftree-vectorize           -Wall     lbm.c
$  (Notes about examples) 
Precedence Example 5: Conflicting sections

If sections conflict with each other, the order of precedence is:

highest     benchmark
            tuning
lowest      label 

The first runcpu command includes --label=wall. It uses the OPTIMIZE setting from lines 13-14, which have higher precedence than the default (lines 15-16).

The second runcpu command includes both --label=wall and --tune peak. The OPTIMIZE setting from lines 11-12 is used.

The third also uses --label=wall --tune peak; and runs 619.lbm_s, which is an fpspeed benchmark. The OPTIMIZE setting from lines 9-10 is used.

$ cat -n precedence_example5.cfg 
  1  flagsurl             = $[top]/config/flags/gcc.xml
  2  iterations           = 1
  3  output_format        = text
  4  output_root          = /tmp/ptest
  5  size                 = test
  6  default:
  7     CC_VERSION_OPTION = -v
  8     CC                = gcc
  9  fpspeed:
 10     OPTIMIZE          = -O3 
 11  default=peak:
 12     OPTIMIZE          = -O2
 13  default=default=wall:
 14     OPTIMIZE          = -O1
 15  default:
 16     OPTIMIZE          = -O0
$ cat precedence_example5.sh 
runcpu --fake --config=precedence_example5 --label=wall 519.lbm          | grep lbm.c
runcpu --fake --config=precedence_example5 --label=wall -T peak  519.lbm | grep lbm.c
runcpu --fake --config=precedence_example5 --label=wall -T peak  619.lbm | grep lbm.c
$ ./precedence_example5.sh 
gcc -c -o lbm.o -DSPEC -DSPEC_CPU -DNDEBUG -DLARGE_WORKLOAD  -O3               lbm.c
$  (Notes about examples) 
Precedence Example 6: Section order

For sections at differing precedence levels, order does not matter.

These two config files use a different order but perform the same functions when runcpu applies them.

$ diff --side-by-side precedence_example6a.cfg precedence_example6b.cfg
iterations           = 1                            iterations           = 1
output_format        = text                         output_format        = text
output_root          = /tmp/ptest                   output_root          = /tmp/ptest
runlist              = 519.lbm_r                    runlist              = 519.lbm_r
size                 = test                         size                 = test
default:                                            default:
   CC_VERSION_OPTION = -v                              CC_VERSION_OPTION = -v
   CC                = gcc                             CC                = gcc
519.lbm_r:                                        <
   OPTIMIZE          = -O3                        <
fprate:                                           <
   OPTIMIZE          = -O1                        <
default:                                            default:
   OPTIMIZE          = -O0                             OPTIMIZE          = -O0
                                                  > fprate:
                                                  >    OPTIMIZE          = -O1
                                                  > 519.lbm_r:
                                                  >    OPTIMIZE          = -O3
$ cat precedence_example6.sh
runcpu --fake --config=precedence_example6a | grep lbm.c
runcpu --fake --config=precedence_example6b | grep lbm.c
$ chmod +x precedence_example6.sh
$ ./precedence_example6.sh 
$  (Notes about examples) 
Precedence Example 7: Repeated Sections

If a section occurs more than once, the settings are combined. If there are conflicts, the last instance wins.

Recall that trailing default specifiers can be dropped. Therefore, these three section markers:

 7  fpspeed:
12  fpspeed=default:
17  fpspeed=default=default:

actually name the same section, which is entered three times.

The EXTRA_CFLAGS, COPTIMIZE, and EXTRA_COPTIMIZE settings (lines 9, 14, and 19) are combined.

The OPTIMIZE settings on lines 8, 13, and 18 conflict with each other. Only the last one is retained.

$ cat -n precedence_example7.cfg
  1  label                = wall
  2  runlist              = 619.lbm_s
  3  tune                 = peak
  4  default:
  5     CC_VERSION_OPTION = -v
  6     CC                = gcc
  7  fpspeed:
  8     OPTIMIZE          = -O1 
  9     EXTRA_CFLAGS      = -finline-functions
 10  intrate:
 11     OPTIMIZE          = -O0 
 12  fpspeed=default:
 13     OPTIMIZE          = -O2 
 14     COPTIMIZE         = -ftree-vectorize
 15  intrate=peak:
 16     OPTIMIZE          = -O0 
 17  fpspeed=default=default:
 18     OPTIMIZE          = -O3 
 19     EXTRA_COPTIMIZE   = -Wall
$ cat precedence_example7.sh 
runcpu --config=precedence_example7 --fake | grep lbm.c 
$ ./precedence_example7.sh 
    -O3 -ftree-vectorize  -finline-functions  -Wall   lbm.c
$  (Notes about examples) 

I.F. Variables and Variable Substitution

You can do variable substitution within a config file.
As described above, the contents of a config file are directed to various consumers.
Therefore, to use variable substitution, you need to know which consumer does the substitution.
Differing syntax is used for each.

q. Wait a minute... all these choices for substitution? Which do I want?

a. Probably either the first in the list: specmake; or the last: the preprocessor.

Format       Example               Description + consumer                               Traps for the unwary
$(round)     $(COMPILER_DIR)       Make variables, interpreted by specmake
\$SHELLVAR   \$SPEC                Shell or command interpreter variables.              Quoting
$[square]    $[top]/config         Unchanging items, substituted by runcpu at startup.  Timing
${curly}     ${benchmark}          Changeable items subject to perl interpolation.      Try not to confuse with %{curly}
                                   $unbracketed is allowed if it is not ambiguous.
%{ENV_var}   %{ENV_LIBRARY_PATH}   Predefined macros for environment variables,
                                   handled by the preprocessor.
%{curly}     %{bits}               Macros, handled by the preprocessor.                 Try not to confuse with ${curly}
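
To see how the forms differ in practice, here is a hypothetical fragment that uses several of them at once. MY_LIBS and the values are invented; treat this as a sketch rather than a tested config file:

```
expand_notes = 1
%define bits 64                # %{curly}: preprocessor macro, substituted first
notes_000    = %{bits}-bit run, log ${lognum}   # ${curly}: interpolated during the run
default:
   MY_LIBS    = $[top]/mylibs  # $[square]: filled in by runcpu at startup
   EXTRA_LIBS = -L$(MY_LIBS)   # $(round): left alone until specmake expands it
```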

I.F.1. Defining variables

You can create your own runcpu variables using a line of the form

name = value

The name may contain only letters, digits, and underscores (a hyphen is NOT allowed).
Start with a letter.
You may indent your definitions if you wish (see: whitespace).

Exception: preprocessor macros differ on all of the above points. Hyphens are allowed; use

%define name value

and the % must appear in column 1. You can indent after the % if you wish.
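
A quick sketch of the column-1 rule; the macro name here is invented:

```
%define    procSpeed fast     # OK: % is in column 1
%   define procSpeed fast     # also OK: you may indent after the %
   %define procSpeed fast     # NOT OK: the % is not in column 1
```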

Conventions: Although not required, certain conventions are usually followed:

  1. lower case is typically used for macros. When combining words, CamelCase may be useful.
  2. lower case is also typically used for runcpu variables, combining words with underscores
  3. CAPITALS are typically used when creating make variables
  4. CAPITALS are also typically used for shell or environment variables.

Examples of the above (in the same order)

%ifndef processorNumaControl
%   define processorNumaControl firstTouch # macro
    my_submit_cmd       = numactl -C $BIND # runcpu variable
    MYTUNE              = -O2 --math=SIMD  # make variable
    ENV_LD_LIBRARY_PATH = /opt/lib         # environment variable

The remainder of this section I.F is about runcpu variables -- the $[square] and ${curly} rows from the table at the top.

I.F.2. $[square] substitution at startup

Immediately after preprocessing, variables that are delimited by $[square brackets] are substituted.
Any value set in the config file can be substituted, provided that it is visible in the current scope.
You can access the value of additional variables that you may have created.
See the list of useful variables below.
Perhaps the most useful is $[top] for the top of the SPEC CPU2017 tree, often found in contexts such as:

    flagsurl01          = $[top]/config/compiler.xml
    flagsurl02          = $[top]/config/platform.xml
    EXTRA_LIBS          = $[top]/mylibs
    preENV_LIBRARY_PATH = $[top]/lib64:$[top]/lib

Traps for the unwary: In some cases it may be obvious to the human which value to use, but the tools aren't as smart as you.

  1. Timing: You cannot substitute variables that do not exist, or whose values aren't known when the config file is first read. For example, the label is not sorted out until later. Values that pertain to individual benchmarks (such as the current iteration number or the name of the benchmark being run) are not set until much later, when benchmarks are actually being run.
  2. Command line: If a value is mentioned on both the runcpu command line and in the config file, the substitution might not do what you expect.
  3. output_root:
    You cannot set an output_root that depends on a runcpu variable.
    You can set one that uses a macro:

    output_root=$[top]/my/path      # wrong
    output_root=${top}/my/path      # wrong
    output_root=%{ENV_SPEC}/my/path # right

Square substitution is done early. That comes in handy if you need a variable right away, for example, in order to use it with preENV.

$ cat EarlySub.cfg 
SW_DIR                 = /opt/path/to/compilers
preENV_LD_LIBRARY_PATH = $[SW_DIR]/lib64

$ cat EarlySub.sh
runcpu --config=EarlySub --fake 519.lbm_r | grep '^LD' | uniq
$ ./EarlySub.sh 
$  (Notes about examples) 

The example below uses variables defined in several named sections. The sections are delimited by section markers default: (line 8), default=base: (line 11), and default=peak: (line 15).

$ cat -n square.cfg
     1  expand_notes         = 1
     2  iterations           = 1
     3  output_format        = text
     4  output_root          = /tmp/square
     5  runlist              = 519.lbm_r
     6  size                 = test
     7  tune                 = base,peak
     8  default:
     9     CC                = gcc
    10     CC_VERSION_OPTION = -v
    11  default=base:
    12     the_system        = STAR
    13     OPTIMIZE          = -O1
    14     notes_base_100    = base tuning uses '$[CC]' '$[OPTIMIZE]' on system '$[the_system]'
    15  default=peak:       
    16     OPTIMIZE          = -O2
    17     notes_peak_100    = peak tuning uses '$[CC]' '$[OPTIMIZE]' on system '$[the_system]'
$ cat square.sh 
runcpu --config=square | grep txt
grep tuning /tmp/square/result/CPU2017.001.fprate.test.txt
$ ./square.sh 
    format: Text -> /tmp/square/result/CPU2017.001.fprate.test.txt
     base tuning uses 'gcc' '-O1' on system 'STAR'
     peak tuning uses 'gcc' '-O2' on system ''
$  (Notes about examples) 

Note that line 14 finds all three variables that it is looking for, but line 17 does not. If it is not clear why this happens, please see the descriptions of named sections and precedence above.

I.F.3. Useful $[square] variables

Useful $[variables] include:

$[configfile] Your config file name
$[configpath] The full path to your config file
$[dirprot] protection that is applied to directories created by runcpu
$[endian] 4321 or 87654321 for big endian; 1234 or 12345678 for little
$[flag_url_base] directory where flags files are looked up
$[OS] unix or windows
$[os_exe_ext] exe for windows, nil elsewhere
$[realuser] the user name according to the OS
$[top] the top directory of your installed SPEC CPU2017 tree
$[username] the username for purposes of tagging run directories
$[uid] the numeric user id

You can access the initial value of most options that you can enter into a config file, including:

action, allow_label_override, backup_config, basepeak, check_version, command_add_redirect, copies, delay, deletework, difflines, env_vars, expand_notes, expid, fake, feedback, flagsurl, http_proxy, http_timeout, ignore_errors, ignore_sigint, info_wrap_columns, iterations, label, line_width, locking, log_line_width, mail_reports, mailcompress, mailmethod, mailport, mailserver, mailto, make, make_no_clobber, makeflags, mean_anyway, minimize_builddirs, minimize_rundirs, no_input_handler, no_monitor, notes_wrap_columns, notes_wrap_indent, output_format, output_root, plain_train, rawformat, rebuild, reportable, runlist, section_specifier_fatal, setprocgroup, size, strict_rundir_verify, sysinfo_program, table, teeout, train_single_thread, tune, use_submit_for_compare, use_submit_for_speed, username, verbose, verify_binaries, version_url.

I.F.4. ${curly} substitution (interpolation) during a run
+ $unbracketed substitution

During a run, variables that are delimited by ${curly brackets} are substituted: ${command}
Usually, variables can be spelled with or without the curlies: $command or ${command}
Exception 1: curlies are required if the variable is adjacent to other text, as in ${command}s
Exception 2: curlies are not allowed for: $BIND and $SPECCOPYNUM

Runcpu uses perl interpolation.
Only scalars (not: perl arrays and hashes) can be interpolated.

Example: on the notes100 line, you could say either ${lognum} or $lognum, but don't try taking the curly brackets away from ${size}.

$ cat just.cfg 
expand_notes      = 1
notes100          = Just ${size}ing, in run $lognum 
output_root       = /tmp/just
runlist           = 505.mcf_r
size              = test
CC                = gcc
$ cat just.sh
runcpu -c just | grep txt
grep Just /tmp/just/result/CPU2017.001.intrate.test.txt
$ ./just.sh 
    format: Text -> /tmp/just/result/CPU2017.001.intrate.test.txt
     Just testing, in run 001
$  (Notes about examples) 

Traps for the unwary

  1. Try not to confuse ${curly} interpolation vs. %{curly} (macros)
  2. Timing: Some variables are only defined at certain times, and a line that uses them might be interpolated at a different time. Therefore interpolation won't always do what you might wish. In particular, notes are not expanded in the context of a particular benchmark run, so variables such as ${tune} are not useful within them.

  3. output_root:
    You cannot set an output_root that depends on a runcpu variable.
    You can set one that uses a macro:

    output_root=$[top]/my/path      # wrong
    output_root=${top}/my/path      # wrong
    output_root=%{ENV_SPEC}/my/path # right

I.F.5. Useful ${curly} variables

These variables may be of interest:

${baseexe} The first part of the executable name, which is <baseexe>_<tune>.<label>. For example, in "lbm_r_base.foo", baseexe is "lbm_r".
${benchmark} The number and name of the benchmark currently being run, for example 519.lbm_r
${benchname} The name of the benchmark currently being run, for example lbm_r
${benchnum} The number of the benchmark currently being run, for example 519
${benchtop} The top directory for the benchmark currently being run, for example /spec/cpu2017/benchspec/CPU/519.lbm_r
$BIND A value from your bind list, typically a numeric identifier for a processor on your system. This variable is actually interpreted by specinvoke, and cannot be spelled with braces. Say $BIND, do not say ${BIND}.
${command} The shell command line to run the current benchmark, for example ../run_base_test_sticky.0000/lbm_r_base.sticky 20 reference.dat 0 1 100_100_130_cf_a.of
${commandexe} The executable for the current command, for example ../run_base_test_sticky.0000/lbm_r_base.sticky
${label} The label for the benchmark being run
${iter} The current iteration number
${logname} The complete log file name, for example /spec/cpu2017/result/CPU2017.168.log
${lognum} The log file number, for example 168
$SPECCOPYNUM The current copy number, when running a SPECrate run. This variable is actually interpreted by specinvoke, and cannot be spelled with braces. Say $SPECCOPYNUM, do not say ${SPECCOPYNUM}. The first copy is 0 (zero).
$SPECUSERNUM Do not use. This is the older, obsolete, CPU2000 spelling for what is now called SPECCOPYNUM. If you use it, it will be silently ignored - no warning is printed.
${tune} The tuning for the benchmark being run (base or peak)
${workload} The current workload number (within the iteration)
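
Several of these variables typically show up together in a submit command, as in this sketch (numactl and the processor list are assumptions about the system, not part of the suite):

```
bind   = 0,1,2,3
submit = numactl --physcpubind=$BIND ${command}
```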

For a complete list of the available variables relative to the current config file, set

expand_notes = 1
verbose      = 35 # or higher

Then, do a run that causes a command substitution to happen.
In the log, you will find many lines of the form:

 - Variables available for interpolation that have changed since the last list:
    (From config) $runmode = "rate"
    (From config) $size = "test"
 - Variables available for interpolation that have changed since the last list:
    (From config) $size = "train"

I.F.6. Unsetting a variable with "%undef%"

It is sometimes useful to undo the setting of a variable that would otherwise be included from another section. This can be accomplished using the special value %undef%. In the example, line 14 undefines COPTIMIZE when compiling peak:

 $ cat -n gnana.cfg 
 1  flagsurl             = $[top]/config/flags/gcc.xml
 2  iterations           = 1
 3  output_format        = text
 4  output_root          = /tmp/undef
 5  runlist              = 519.lbm_r
 6  size                 = test
 7  tune                 = base,peak
 8  default:
10     CC                = gcc
11     OPTIMIZE          = -O2
12     COPTIMIZE         = -fno-tree-pre
13  519.lbm_r=peak:
14     COPTIMIZE         = %undef%
$ runcpu --config=gnana | grep txt
    format: Text -> /tmp/undef/result/CPU2017.001.fprate.test.txt
$ cd /tmp/undef/benchspec/CPU/519.lbm_r/build
$ grep OPTIMIZE build_base_none.0000/Makefile.spec 
COPTIMIZE        = -fno-tree-pre
OPTIMIZE         = -O2
$ grep OPTIMIZE build_peak_none.0000/Makefile.spec 
COPTIMIZE        = 
OPTIMIZE         = -O2
$ (Notes about examples)   

Notice that in the build directory, COPTIMIZE is present for base and absent for peak.

I.F.7. Debug tips for runcpu variables

When debugging a config file that uses runcpu variables, try:

iterations       = 1
minimize_rundirs = 0
reportable       = 0
runlist          = (one or two benchmarks)
size             = test 
verbose          = 40

Using --fake will probably be informative. Look inside the log for the (case-sensitive) word 'From'.

II. Config file options for runcpu

This section documents options that control the operation of runcpu itself.

II.A. Precedence: config file vs. runcpu command line

In the list that follows, some items are linked to the document Using SPEC CPU2017 - the 'runcpu' Command because they can be specified either in a config file, or on the runcpu command line.

New with CPU2017: if an option is specified in both places, the command line wins.

II.B. Options

In the table that follows, if an option is documented as accepting the values "no" and "yes", these may also be specified as "false" and "true", or as "0" and "1".

The "Use In" column indicates where the option can be used:

H use only in header section
N use in a named section.
H,N can be used in both the header section and in named sections. The item can therefore be applied on a global basis, and/or can be applied to individual benchmarks.
Option Use In Default Meaning
action H validate What to do. The available actions are defined in the runcpu document. See also the buildsetup example in the section on specmake.
allow_label_override H no

The runcpu command can use --label to select sections that apply to a specific label. If the label mentioned on the runcpu command does not occur in any section marker:

  • If allow_label_override is no (or 0 or false), the tools will print an error message and exit.
  • If allow_label_override is yes (or 1 or true), then the tools will attempt to build (or run) the same binary they would have built by default, but will name the binary with the specified label.
This option was known as allow_extension_override prior to CPU2017.
backup_config H yes When updating the hashes in the config file, make a backup copy first. Highly recommended to defend against full-file-system errors, system crashes, or other unfortunate events.
basepeak H,N no Use base binary and/or base result for peak. If applied to the whole suite then only base is run, and its results are reported for both the base and peak metrics. If applied to a single benchmark, the same binary will be used for both base and peak runs, and the base results will be reported for both. (The reason for running the binary during both base and peak is to remove the possibility that skipping a benchmark altogether might somehow change the performance of some other benchmark.)
bind H,N ''

List of values to substitute for $BIND in a submit command. This can be a simple comma (or white-space) separated list, such as:

 bind = 0,1,2,3,4,5,6,7, 16,17,18,19,20,21,22,23     

If your line is too long, don't try to continue it by adding a numeral to 'bind'; that has a different meaning, as described in the next section. Instead, you can continue it either by putting a backslash at the end of lines, or by using a here document. The above single line could equally well be expressed as:

bind = <<EOT
0,   1,   2,   3,   4,   5,   6,   7,
16, 17,  18,  19,  20,  21,  22,  23
EOT

The trailing comma after the "7" is both optional and harmless. It can be present or not, as you wish; in either case, the sequence of values for this example will include ...5, 6, 7, 16, 17, 18...
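
The backslash continuation mentioned above would look like this (same values, purely illustrative):

```
bind = 0,  1,  2,  3,  4,  5,  6,  7, \
       16, 17, 18, 19, 20, 21, 22, 23
```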

bindn H,N ''

List of strings to substitute for $BIND in a submit command. Put each string on a new line, increasing the index n. Like notes, the values of the indices are not important and are used for ordering only. If there are more copies than bind values, they will be re-used in a circular fashion. If there are more bind values specified than copies, then only as many as needed will be used.

$ cat bindN.cfg 
copies      = 4           
iterations  = 1         
output_root = /tmp/submit
runlist     = 541.leela_r
size        = test       
   bind0    = assign_job cpu_id=11
   bind1    = assign_job cpu_id=13
   bind2    = assign_job cpu_id=17
   bind3    = assign_job cpu_id=19
   submit   = $BIND ${command}
$ runcpu --fake --config=bindN | grep '^assign' | cut -b 1-70
assign_job cpu_id=11 ../run_base_test_none.0000/leela_r_base.none test
assign_job cpu_id=13 ../run_base_test_none.0000/leela_r_base.none test
assign_job cpu_id=17 ../run_base_test_none.0000/leela_r_base.none test
assign_job cpu_id=19 ../run_base_test_none.0000/leela_r_base.none test
$(Notes about examples)    
check_version H no (yes for

When set, before doing a reportable run, runcpu will download a small file (~15 bytes) from www.spec.org containing the current version of the suite and the date it was released, and check your copy vs. that file. In this way, you can be notified if the version of the suite that you're using is out-of-date.

Setting this variable to no will disable this check.

If you'd like to check a local file instead, you can modify version_url to point to your internal copy.

If you would like to check your version for a NON-reportable run, you will need to add --check_version to your command line. Setting check_version=yes in the config file only causes the check for reportable runs.
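
For example, a site that keeps an internal copy of the version file might use something like this sketch; the URL is an invented placeholder:

```
check_version = yes
version_url   = http://intranet.example.com/spec/cpu2017/version.txt
```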

command_add_redirect H no

If set, the generated ${command} will include redirection operators (stdout, stderr), which are passed along to the shell that executes the command. If this variable is not set, specinvoke does the redirection. This option is commonly used with fdo_run or submit. See Tip #3 under FDO Example 5.

When used with submit, the command_add_redirect feature lets you choose whether redirection operators (such as <this_benchmark.in or >that_benchmark.out) are applied to your entire modified submit command (the default) or just to the portion that has ${command}.
In pseudo-code:

A bind list has processor numbers n; a benchmark runs exe; it sends output to out.
If you use:
submit = send_job job="${command}" cpu_id=$BIND
   Do you prefer this?
         send_job job="exe >out" cpu_id=n
Or this?
         send_job job="exe" cpu_id=n >out

The above is only pseudo-code; see the section on Using Submit for real examples.

copies H,N 1

Number of copies. For base, the number of copies must be the same for all benchmarks, but for peak it is allowed to vary: for example, you could decide to run 64 copies of all benchmarks except 519.lbm_r, which would run only 63.

Note: If you select basepeak=yes for a benchmark, the number of copies in peak will be forced to be the same as in base.

current_range H,N none Set the maximum current in amps to be used by the power analyzer(s). This can be used to control the settings on a per benchmark level (named section) or across all benchmarks (header section).
delay H,N 0 Insert a delay of the specified number of seconds before and after benchmark execution. This delay does not contribute to the measured runtime of the benchmark. This delay is also not available in a reportable run.
deletework H,N no If set to yes, always delete existing benchmark working directories. An extra-careful person might want to set this to ensure no unwanted leftovers from previous benchmark runs, but the tools are already trying to enforce that property.
difflines H,N 10 Number of lines of differences to print when comparing results.
enable_monitor H,N yes

If this variable is set to no, then all of the monitoring hooks are disabled. This can be overridden by setting force_monitor, which is new with CPU2017.

env_vars H,N no

If set to yes, environment settings can be changed using ENV_* options in the config file. Note that you cannot change OMP_NUM_THREADS using this feature. Use threads.

Example: Consider the config file below, which creates a binary with the requested label:

$ cat PerformabilityQOS.cfg
runlist     = 520.omnetpp_r
iterations  = 1
size        = test
tune        = peak
label       = srini
   CXX                 = g++
   OPTIMIZE            = -O1
$ date
Mon Oct  3 17:49:46 PDT 2016
$ runcpu -c PerformabilityQOS | grep -i -e error: -e success
Build successes for intrate: 520.omnetpp_r(peak)
Success: 1x520.omnetpp_r
$ go 520.omnet exe
$ ls -g | cut -b 27-88
 Oct  3 17:52 omnetpp_r_peak.srini
$(Notes about examples)    

The binary just above depends on certain libraries. What if they aren't available, or have been moved?

$ ldd omnetpp_r_peak.srini 
... version `GLIBCXX_3.4.20' not found (required by ./omnetpp_r_peak.srini)
... version `CXXABI_1.3.8' not found (required by ./omnetpp_r_peak.srini) 

In the config directory, a second copy of the config file has been created, with the addition of env_vars and a line that inserts a new directory into the front of the library path. As shown below, this fixes the problem.

$ diff -u PerformabilityQOS.cfg PerformabilityQOS.2.cfg 
--- PerformabilityQOS.cfg	2016-10-03 17:52:39.000000000 -0700
+++ PerformabilityQOS.2.cfg	2016-10-03 17:54:45.000000000 -0700
@@ -3,8 +3,10 @@
 size        = test
 tune        = peak
 label       = srini
+env_vars    = 1
+ENV_LD_LIBRARY_PATH = %{ENV_SPEC}/libraries:%{ENV_LD_LIBRARY_PATH}
    CXX                 = g++
    OPTIMIZE            = -O1
$ runcpu -c PerformabilityQOS | grep -i -e error: -e success
error: a total of 1 children finished with errors
$ runcpu -c PerformabilityQOS.2 | grep -i -e error: -e success
Success: 1x520.omnetpp_r

Notice above that the original fails, and the modified version with env_vars succeeds. Other notes:

  1. The syntax %{ENV_x} uses the preprocessor to import both the definition for SPEC and the definition for LD_LIBRARY_PATH as they were at runcpu startup. If the definitions for SPEC and LD_LIBRARY_PATH at the start of the run are, respectively, /A and /B, then runcpu will add 'LD_LIBRARY_PATH=/A/libraries:/B'  to the environment when it invokes the executable omnetpp_r_peak.srini.
  2. For SPEC CPU2017, the environment must be consistent during base (rule 3.4). You may vary it for peak.
  3. When developing a config file that uses env_vars, you may find it useful to set verbose to 35 (or higher), which will cause the tools to log environment settings.
  4. Which environment? If you are attempting to set the environment for your entire run, this is not the feature that you are looking for. Try preenv instead. The env_vars option and ENV* are about communication from the config file to the environment of the invoked benchmark.
expand_notes H no If set, will expand variables in notes. This capability is limited because notes are NOT processed by specmake, so you cannot do repeated substitutions.
expid H If set to a non-empty value, will cause executables, run directories, results, and log files to be put in a subdirectory (with the same name as the value set) in their normal directories. For example, setting expid = CDS will cause benchmark binaries to end up in exe/CDS, run directories to end up in run/CDS, and results and logs in $SPEC/result/CDS.
fail H,N no If set, will cause a build or run to fail.
fail_build H,N 0

If set, will cause a build to fail. For example, you could say something like this:

#> I am posting this config file for use by others in the
#> company, but am forcing it to fail here because
#> I want to force users to review this section.
#> Once you find your way here, you should test whether
#> bug report 234567 has been fixed, by using the first
#> line below.  If it has not been fixed, then use the
#> second.  In either case, you'll need to remove the
#> fail_build.
#>   - Pney Guvaxre
#>     Boomtime, the 66th day of Confusion in the YOLD 3172

# OPTIMIZE = -Osuperduper
# OPTIMIZE = -Omiddling
fail_build = yes

In the example above, the build is forced to fail until the user examines and modifies that section of the config file. Notice that Pney has used protected comments to cause the comments about the internal bug report to disappear if the config file were to be published as part of a reportable run.

fail_run H,N no If set, will cause a run to fail.
feedback H,N yes

The feedback option applies an on/off switch for the use of feedback directed optimization (FDO), without specifying how the feedback will be done.

  • If set to no, then feedback will not be used, even if other options would normally cause it to occur.
  • If set to yes, then feedback is allowed to be used, provided that additional options are also set to describe how the FDO build should be carried out.

The interaction between feedback and these other options is described in the section on Using Feedback, below.

flagsurl H none

If set, retrieve the named URL or filename and use that as the "user" flags file. If the special value "noflags" is used, runcpu will not use any file and (if formatting previously run results) will remove any stored file. Automated processing of flags is explained in flag-description.html.

If you want to list more than one flagsfile, the recommended method is by using numbered continuation lines, for example:

     flagsurl1 = mycompiler.xml
     flagsurl2 = myplatform.xml 

Using other methods (such as backslash continuation) to specify multiple flags files may appear to work, but may result in unexpected differences between the original config file and the config file as written by output format config.

Multiple flags files will typically be needed, because flags files are separated into two types, "compiler", and "platform".

force_monitor H,N no

If this variable is set to yes, then all of the monitoring hooks are enabled, regardless of settings that would otherwise turn them off. This means that every invocation of specinvoke will be subject to monitor_specrun_wrapper, and all command invocations will be subject to monitor_wrapper. This includes things that would normally not be subject to monitoring, such as FDO training runs, input generation commands, and commands used for validating benchmark output such as specdiff. force_monitor is new with CPU2017.

http_proxy H

In some cases, such as when doing version checks and loading flag description files, runcpu will use HTTP to fetch a file. If you need to specify the URL of a proxy server, this is the variable to use. By default, no proxy is used. Note that this setting will override the value of the http_proxy environment variable. For example, one might set:

http_proxy = http://webcache.tom.spokewrenchdad.com:8080 
http_timeout H 30 This is the amount of time (in seconds) to wait while attempting to fetch a file via HTTP. If the connection cannot be established in this amount of time, the attempt will be aborted.
idle_current_range H none Set the maximum current in amps to be used by the power analyzer(s) for the idle power measurement.
ignore_errors H no Ignore certain errors which would otherwise cause the run to stop. Very useful when debugging a new compiler and new set of options: with this option set, you'll find out about all the benchmarks that have problems, instead of only finding out about the first one.
ignore_sigint H no Ignore SIGINT. If this is set, runcpu will attempt to continue running when you interrupt one of its child processes by pressing ^C (assuming that you have ^C mapped in the common way). Note that this does NOT cause runcpu itself to ignore SIGINT.
info_wrap_columns H 50 When set to a value greater than 0, attempts to split non-notes informational lines such that they are no longer than info_wrap_columns columns wide. Lines are split on whitespace, and newly created lines are guaranteed to have at least the same indentation as the original line. If a line contains an item that is longer than info_wrap_columns, a warning is logged and the original line is left unchanged.
iterations H 3 Number of iterations to run.
keeptmp H no

Whether or not to keep various temporary files. If you leave keeptmp at its default setting, temporary files will be automatically deleted after a successful run. If not, temporary files may accumulate at a prodigious rate, and you should be prepared to clean them by hand. Temporary files include:

  • CPU2017.nnn.log.debug in your result directory ($SPEC/result, %SPEC%\result, or the corresponding location under your output_root).
  • A directory named tmp/ under one of these places:

    $SPEC/               or        %SPEC%\
    Whatever you set for an output_root directory
  • If tmp/ is not usable, you might find temporaries under tmp.cpu2017 or, possibly, tmp.cpu2017.n, where n may range from 1 to 15.
label H none

An arbitrary tag for executables, build directories, and run directories.
Legal characters are: alphanumerics, underscore, hyphen, and period. For CPU2017, this field replaces the field ext.

If a label is used as a section specifier, it can be referenced from the runcpu command line.
If a runcpu command uses a label that is not defined in a config file, by default the run will stop with an error:

runcpu --label=yusoff [...]
ERROR: The label 'yusoff' defines no settings in the config file!

The error can be disabled if you set allow_label_override=yes.

Tip: Make a habit of setting the label to anything that is meaningful to you. Doing so makes it easier to compare binaries, easier to track history, and easier to clean up by removing directories that match the label. Example:

This config file runs 519.lbm_r with 3 copies of the (very short) test workload. It uses differing tuning depending on the label:

copies               = 3
output_format        = text
runlist              = 519.lbm_r
default:
   CC                = gcc
default=default=OhZero:
   OPTIMIZE          = -O0
default=default=OhThree:
   OPTIMIZE          = -O3

The label can be used when hunting down directories - either to analyze them or to delete them, as in the Unix find command. On other systems, try searching in a GUI file manager.

$ find . -type d -name '*OhZ*'
$ cd benchspec/CPU/519.lbm_r/build
$ grep OPTIMIZE */Makefile.spec
build_base_OhThree.0000/Makefile.spec:OPTIMIZE   = -O3
build_base_OhZero.0000/Makefile.spec:OPTIMIZE   = -O0
Option Use In Default Meaning
line_width H 0 Line wrap width for screen. If left at the default, 0, then lines will not be wrapped and may be arbitrarily long.
locking H yes Try to use file locking to avoid race conditions, e.g. if more than one copy of runcpu is in use. Although performance tests are typically done with only one copy of runcpu active, it can be handy to run multiple copies if you are just testing for correctness, or if you are compiling the benchmarks.
log_line_width H 0 Line wrap width for log files. If your editor complains about lines being too long when you look at log files, try setting this to some reasonable value, such as 80 or 132. If left at the default, 0, then lines will not be wrapped and may be arbitrarily long.
log_timestamp H no Whether or not to prepend time stamps to log file lines.
mailcompress H no When using the 'mail' output format, turning this on will cause the various report attachments to be compressed with gzip.
mailmethod H smtp When using the 'mail' output format, this specifies the method that should be used to send the mail. On UNIX and UNIX-like systems, there are three choices: 'smtp' (communicate directly with an SMTP server over the network), 'sendmail' (try invoking sendmail directly from locations where it is commonly installed), and 'qmail' (try invoking qmail-inject from locations where it is commonly installed). On Windows systems, only 'smtp' is available. SMTP is the recommended setting.
Using a sendmail or qmail-inject program from a non-standard location is possible only by setting the PERL_MAILER environment variable. See the Mail::Mailer documentation for details.
mailport H 25 When using the 'mail' output format, and when the mailmethod is 'smtp', this specifies the port to use on the mail server. The default is the standard SMTP port and should not be changed.
mailserver H When using the 'mail' output format, and when the mailmethod is 'smtp', this specifies the IP address or hostname of the mailserver through which to send the results.
Option Use In Default Meaning
mailto H ''

The address or addresses to which results should be sent when using the 'mail' output format. If multiple addresses are specified, separate them with commas or whitespace. Each address should consist only of the name@domain part (i.e. no "full name" type info). The addresses are not checked for correct formatting; if a mistake is made, the results may be sent to an unknown location, so be careful with this field.

Config files as posted at www.spec.org/cpu2017 will not include whatever you put on this line (thus, spambots will not see the contents of this field).

Note that to get your reports mailed to you, you need to specify both mail as an output_format and an address to which they should be mailed. For example:
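A minimal header-section sketch (the address is a placeholder):

```
output_format = txt,mail
mailto        = perf-team@example.com
```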


If no addresses are specified, no mail will be sent.

mail_reports H all The list of report types to mail. The format and possible values are the same as for output_format, with the addition of log, which causes the current log file to be sent. The default is for all files associated with the run to be mailed: whatever you listed as your desired output_format, plus log (the log file) and rsf (the rawfile). You can cut your email down to the bare essentials with something like this:
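One possibility (a sketch that mails only the text report and the log file):

```
mail_reports = txt,log
```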
If none of the requested report types were generated, no mail will be sent.
Option Use In Default Meaning
make H,N specmake Name of make executable. Note that the tools will enforce use of specmake for reportable results.
make_no_clobber H,N no Don't delete directories when building executables. The default is no, meaning "clobber". The "yes" option, meaning "avoid clobbering", should only be used for troubleshooting a problematic compile. The tools will not allow you to use this option when building binaries for a reportable result. Note that you could issue multiple successive runcpu commands with this option set (either in the config file, or with the --make_no_clobber switch), and the build directories will be preserved. But once you remove make_no_clobber (allowing it to default back to no), then the tools will attempt a normal build with a fresh build directory.
makeflags H,N ''

Extra flags for make (such as -j). Set this to -j n where n is the number of concurrent processes to run during a build. Omitting n or setting it to zero unlimits the number of jobs that will be run in parallel. Use with care! Make flags should be used here only if you are familiar with GNU make. (The program specmake is GNU Make under another name to ensure no accidental conflicts with other Make utilities you might have. The GNU Make Manual can be consulted, and you can also say specmake --help.)

Note that requesting a parallel build with makeflags = -j N causes multiple processors to be used at build time. It has no effect on whether multiple processors are used at run time, and so does not affect how you report on parallelism.
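For instance, to allow up to 8 parallel make jobs (the count is illustrative):

```
makeflags = -j 8
```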

Option Use In Default Meaning
mean_anyway H no Calculate mean even if invalid. DANGER: this will write a mean to all reports even if no valid mean can be computed (e.g. half the benchmarks failed). A mean from an invalid run is not "reportable" (that is, it cannot be represented in public as the SPEC metric).
minimize_rundirs H no During a run, try to keep working disk size down. Cannot be used in a reportable run.
minimize_builddirs H no Try to keep working disk size down during builds.
nobuild H no Do not attempt to build benchmarks. Useful to prevent attempts to rebuild benchmarks that cannot be built. Also comes in handy when testing whether proposed config file options would potentially force an automatic rebuild.
no_monitor H,N '' Exclude the listed workloads from monitoring via the various monitor_* hooks.
Option Use In Default Meaning
no_input_handler H,N close

Method to use to simulate an empty input. Choices are:

  • close - Just close stdin before executing the benchmark (this is the default).
  • zero - Create a zero-length file and attach it to stdin.
  • file - Synonym for zero.
  • null - Open the null device (/dev/null or NUL:) and attach it to stdin.

Normally, this option should be left at the default; it was actually added to the tools for the benefit of a different SPEC suite that needed the feature. If a reportable run for CPU2017 uses this feature, an explanation should be provided as to why it was used.

notes_wrap_columns H 0 When set to a value greater than 0, attempts to split notes lines such that they are no longer than notes_wrap_columns columns wide. Lines are split on whitespace, and newly created lines are guaranteed to have at least the same indentation as the original line. If a line contains an item that is longer than notes_wrap_columns, a warning is logged and the original line is left unchanged.
notes_wrap_indent H ' ' When line wrapping is enabled (see notes_wrap_columns), this is the string that will be prepended to newly created lines after the indentation from the original line is applied. The default is four spaces, but it can be set to any arbitrary string.
Option Use In Default Meaning
output_format H all Format for reports. Valid options are listed at runcpu.html under --output_format; major options include txt (ASCII text), html, pdf, and ps. You might prefer to set this to txt if you're going to be doing lots of runs, and only create the pretty reports at the end of the series. See also the information in runcpu.html about --rawformat.
output_root H

If set to a non-empty value, all output files will be rooted under the named directory, instead of under $SPEC (or %SPEC%).
If the output_root is not an absolute path (one that begins with "/" on Unix, or a device name on Windows), the path will be created under $SPEC or %SPEC%. You probably will find it more useful if you pick some place other than the default, for convenience when cleaning directories, allocating quotas, or optimizing I/O.

You can navigate a rooted directory with ogo.
This option can be useful for sharing an installation.
It can also can be useful to optimize your I/O for a big SPECrate run: you can put the output_root on your faster device.

Example: Van has a 500 TB optimized file system under /Ouachita. If he types

runcpu --output_root=/Ouachita/Van --config=mine 520.omnetpp_r
For these purposes                                These directories will be created
log files and reports                             /Ouachita/Van/result/
520.omnetpp_r build directories                   /Ouachita/Van/benchspec/CPU/520.omnetpp_r/build/
520.omnetpp_r benchmark binaries (executables)    /Ouachita/Van/benchspec/CPU/520.omnetpp_r/exe/
520.omnetpp_r run directories                     /Ouachita/Van/benchspec/CPU/520.omnetpp_r/run/
Option Use In Default Meaning
parallel_test H number of base copies

For reportable runs, the tools verify that benchmark binaries get the correct answer for the test and train workloads. The time required for such verification does not count toward the reported score, but does add to the cost of benchmarking. To reduce that cost, you can run multiple tests simultaneously, by setting this option to anything higher than 1.

For SPECrate runs, the default is the number of base copies. For SPECspeed runs, the default is 1, that is, effectively off, but the feature can be turned on by setting it to something higher than 1.

The feature can be enabled for non-reportable runs by setting parallel_test_workloads to one or more of "test", "train", or "ref".
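For example, a non-reportable run might use settings like these (values illustrative):

```
parallel_test           = 4
parallel_test_workloads = test,train
```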

If the feature is enabled, and if parallel_test_submit is also set, any submit and bind settings will be used to distribute the jobs.


Notes:

  1. The parallel test features (parallel_test and parallel_test_submit) control parallelism during the runs of the test and train workloads for reportable runs. They do not affect how the binary is built; therefore, they have no effect on the setting of the report field "Parallel: Yes/No", discussed below.
  2. The parallel test feature is not available on Microsoft Windows.

The above notes apply to all the parallel_test switches.

parallel_test_submit H no

Whether or not to use your submit and bind settings when doing parallel_test. When you specify parallel_test > 1, by default your submit and bind settings are not applied, because typically the operating system's default policies can be trusted to do an adequate job scheduling this (non-timed) work.

Notes: See notes under parallel_test, above.

parallel_test_workloads H no

Which workload classes to run in parallel for non-reportable runs when parallel_test is enabled. The parallel_test_workloads feature is new with CPU2017.

Notes: See notes under parallel_test, above.

Option Use In Default Meaning
plain_train H,N yes When set to yes (or true or 1), does not apply any submit commands to the feedback training run. It also causes the monitor_* hooks to be ignored for the feedback training run.
power H none When this is set to yes, it tells the tools to collect power data during the run. Power measurement is new with CPU2017. For V1.0, it is considered experimental and non-comparable.
power_analyzer H none This provides a list of names that are associated with the power analyzers the tools will use to communicate with the SPEC Power/Temperature Daemon (PTD). These names will then be used in the descriptive fields options which relate to describing the power analyzers. When used with the descriptive fields, only the letters and numbers are used in the {id} portion of the field.
preenv H yes

Use preENV_ lines in the config file. When this option is set (the default), lines of the form preENV_<variable> = <value> will cause runcpu to set the specified environment variable to value and re-exec runcpu to perform the run. The restart is done in order to ensure that the entire run takes place with the new settings. You can set preENV_SOME_VARIABLE = value only in the header section or a section using one of these benchmark specifiers:


Any attempt to use preENV_ in other sections is silently ignored.
You cannot change OMP_NUM_THREADS using this feature; use the threads option instead.
Multiple preENV_ settings may appear in the config file.
If you are looking for a way to affect the environment of an individual benchmark, try env_vars, not preenv.

Example: Above, the section on env_vars showed how a library path can be adjusted for a single benchmark.
Instead of adjusting one benchmark, the environment could be set for all benchmarks by putting a line for preENV_LD_LIBRARY_PATH at the top of the config file:
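For instance, the header section might contain a line such as this (the path is hypothetical):

```
preENV_LD_LIBRARY_PATH = /usr/local/lib64
```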

$ grep -i env PerformabilityQOS.cfg      # none used: failure
$ runcpu -c PerformabilityQOS.cfg | grep -i -e error: -e success
error: a total of 1 children finished with errors
$ grep -i env PerformabilityQOS.2.cfg      # env_VAR: works  
env_vars    = 1
$ runcpu -c PerformabilityQOS.2.cfg | grep -i -e error: -e success
Success: 1x520.omnetpp_r
$ grep -i env PerformabilityQOS.3.cfg    # preENV: also works
$ runcpu -c PerformabilityQOS.3.cfg | grep -i -e error: -e success
Success: 1x520.omnetpp_r
Option Use In Default Meaning
reportable H no Strictly follow reporting rules, to the extent that it is practical to enforce them by automated means. The tester remains responsible for ensuring that the runs are rule compliant. You must set reportable to generate a valid run suitable for publication and/or submission to SPEC.
rebuild H no Rebuild binaries even if they exist.
runlist H none Benchmarks or sets to run. Names can be abbreviated, just as on the command line. See the long discussion of run order in runcpu.html.
save_build_files H,N none

After a build is finished, files matching any of the space-delimited wildcard patterns in this variable will be gathered up and saved. When a non-reportable run is being set up, those files will be unpacked into the run directory when the executable is copied in. The auxiliary file package is ignored when setting up reportable runs. The save_build_files feature is new with CPU2017.

section_specifier_fatal H yes

While parsing the config file, if a section specifier is found that refers to an unknown benchmark or benchset, an error is output and the run stops. Set section_specifier_fatal=no in the header section of your config file to convert this error into a warning and allow the run to continue. The ability to convert section specifier errors into warnings is probably of use only for benchmark developers.

Option Use In Default Meaning
setprocgroup H yes Set the process group. On Unix-like systems, improves the chances that ^C gets the whole run, not just one of the children.
size H ref

Size of input set: test, train, or ref
If you are in the early stages of testing a new compiler or new set of options, you might set this to test or train.
The reference workload (ref) is the only size whose time appears in reports.

You might choose to use runcpu --size=test while debugging a new set of compilation options.

Reportable runs automatically invoke all three sizes: they ensure that your binaries can produce correct results with the test and train workloads and then run the ref workload either 2 or 3 times for the actual measurements.

Caution: When requesting workloads, you should use only the terms test, train, or ref. You should stop reading here.

Still reading? Sigh. OK, here are the possibilities.

  • Other options might be used by other benchmarking suites
  • For the reference workload, although it is possible to be more specific than "ref" by using "refrate" or "refspeed", this is not recommended because one of two things will happen: (1) Either you will ask for the same reference workload that the tools would have given you anyway (namely: refrate for 5xx, refspeed for 6xx); or (2) You will ask for the wrong one, and end up in a state that SPEC does not support, and which cannot produce any metrics.
  • If you are doing benchmark research and development, you could play with other workloads. In that case, have you done convert_to_development yet? You should.
Option Use In Default Meaning
src.alt N none

Name of a SPEC-approved alternate source.
Set the option in the named section for the benchmark(s) where you wish to use it.
Multiple src.alts may be specified for a single benchmark. Separate them with commas.
The effect of using a src.alt is to modify the sources in the build directory.
You may also spell this option as srcalt.
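For example, in the named section for a benchmark (the src.alt names are hypothetical):

```
nnn.benchmark:
   src.alt = FixMumble,FixOther
```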


About alternate sources: Sometimes a portability issue may require use of different source code for a benchmark, and SPEC may issue a src.alt, which is a compressed tar file containing modifications, created by makesrcalt.

To use a src.alt, see the instructions posted with it at www.spec.org/cpu2017/src.alt.
Typically, you begin by downloading it to your $SPEC or %SPEC% directory and unpacking it:

$ cd $SPEC   # or on Microsoft Windows: cd %SPEC%
$ specxz -dc nnn.benchmark.FixMumble.tar.xz | spectar -xvf -
$ cat README.nnn.benchmark.src.alt.FixMumble.txt

The README will explain what to add to your config file and any other instructions that are needed.

After you unpack it, a directory is created under <benchmark>/src/src.alt/ and a set of patches is stored there. You can look at the patches using dumpsrcalt, but it may be easier to just apply the src.alt and look at a build directory.

Example: This config file builds with or without a src.alt, depending on the runcpu setting for the --label.

$ cat testme.cfg
action               = buildsetup
runlist              = nnn.benchmark
tune                 = base
default:
   CC                = gcc
default=default=without:
   OPTIMIZE          = -O2
default=default=with:
   OPTIMIZE          = -O2
   srcalt            = FixMumble

To populate build directories:

runcpu --label=without --config=testme
runcpu --label=with    --config=testme

Then, if you visit benchspec/CPU/nnn.benchmark/build you can compare the directories.

Option Use In Default Meaning
strict_rundir_verify H yes

When set, the tools will verify that the file contents in existing run directories match the expected checksums. Normally, this should always be on, and reportable runs will force it to be on. Turning it off might make the setup phase go a little faster while you are tuning the benchmarks.

Developer notes: setting strict_rundir_verify=no might be useful when prototyping a change to a workload or testing the effect of differing workloads. Note, though, that once you start changing your installed tree for such purposes it is easy to get lost; you might as well keep a pristine tree without modifications, and use a second tree that you convert_to_development.

sysinfo_program H 'specperl $[top]/bin/sysinfo'

The name of an executable program or script that automatically records information about your system configuration. It creates a record that is contemporaneous with the measurement, and which is not subject to human transcription error.

New with CPU2017: Warning: published results must use the SPEC-supplied sysinfo.

If you would like to turn the feature off (perhaps during development efforts), you can use:

sysinfo_program = 

Remember to turn it on again when you do your "real" runs. To use a different one (presumably with SPEC's approval - see warning just above), add a line near the top of your config file (i.e. in the header section):

sysinfo_program = <path_to_your_sysinfo_program>

Details about the sysinfo utility may be found in SPEC CPU2017 Utilities, including how to selectively enable output types, how to resolve conflicting field warnings, and how to write your own sysinfo utility.

Option Use In Default Meaning
table H yes In ASCII reports, include information about each execution of the benchmark.
teeout H no Run output through tee so you can see it on the screen. Primarily affects builds, but also provides some information about progress of runtime, by showing you the specinvoke commands.
train_single_thread H,N no Ensure that feedback training runs are done using only a single thread. The train_single_thread feature is new with CPU2017.
train_with H,N train

Select the workload with which to train binaries built using feedback-directed optimization.

The ability to train with alternate workloads would not normally be applicable to CPU2017; the feature was added for the benefit of a different suite that uses the same toolset. Nevertheless, it could be used, for example when studying the efficacy of different training methods, as follows: (1) First convert your tree to a development tree; (2) place your new training workload under nnn.benchmark/data/myworkload. (3) Give it the same structure as the existing training workload: an input/ directory, an output/ directory, and a reftime file with contents similar to the one found in nnn.benchmark/data/train/reftime.

For reportable runs, you cannot use binaries that were trained with alternate workloads.

Option Use In Default Meaning
temp_meter H none This provides a list of names associated with the temperature meters the tools will use to communicate with the SPEC Power/Temperature Daemon (PTD). These names will then be used in the descriptive fields options which relate to describing the temperature meters. When used with the descriptive fields, only the letters and numbers are used in the {id} portion of the field.
threads H,N 1 Value to be set for OMP_NUM_THREADS when benchmarks are run.
If you have set OMP_NUM_THREADS in your environment prior to using runcpu, that setting will be ignored. To use multiple benchmark threads, you must use either this config file field or the corresponding option runcpu --threads=N.
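For example, to run with 16 OpenMP threads (the count is illustrative):

```
threads = 16
```

The equivalent command-line form is runcpu --threads=16.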
tune H base Default tuning level. In a reportable run, must be either base or all.
use_submit_for_compare H,N no If set, use submit commands for benchmark validation commands if submit was used for the run itself. This feature is new with CPU2017.
use_submit_for_speed H,N no If set, use submit commands for SPECspeed runs as well as SPECrate runs.
verbose H 5 Verbosity level. Select level 1 through 99 to control how much debugging info runcpu prints out. For more information, see the section on log files, below.
verify_binaries H yes

runcpu uses checksums to verify that executables match the config file that invokes them, and if they do not, runcpu forces a recompile. You can turn that feature off by setting verify_binaries=no.

Warning: It is strongly recommended that you keep this option at its default, yes (that is, enabled). If you disable this feature, you effectively say that you are willing to run a benchmark even if you don't know what you did or how you did it -- that is, you lack information as to how it was built!

The feature can be turned off because it may be useful to do so sometimes when debugging (for an example, see env_vars), but it should not be routinely disabled.

Since SPEC requires that you disclose how you build benchmarks, reportable runs (using the command-line switch --reportable or config file setting reportable=yes) will cause verify_binaries to be automatically enabled. For CPU2017, this field replaces the field check_md5.

version_url H http://
If version checking is enabled, this specifies the location from which the version information should be fetched.
voltage_range H none Set the maximum voltage in volts to be used by the power analyzer(s) for power measurement.

III. Config file options for specmake

For SPEC CPU you do not write Makefiles. Instead, you set Make variables in the config file, which are sent to a SPEC-supplied copy of GNU Make, known as specmake. Variables with a dollar sign and parentheses, aka "round brackets", are substituted by specmake. For example:

COMPILER_DIR = /usr/local/bin/
CC           = $(COMPILER_DIR)cc
CXX          = $(COMPILER_DIR)c++
FC           = $(COMPILER_DIR)f90

See below for more information on syntax of variables that you create and reference.

III.A. Commonly used Make variables

The following Make variables are frequently useful. When selecting where to put a flag, please bear in mind that the run rules require that portability flags be placed in the PORTABILITY variables.

CC How to invoke your C compiler.
CXX How to invoke your C++ compiler.
FC How to invoke your Fortran compiler.

New with SPEC CPU2017: You must specify how to ask each compiler "Please tell me your version" because the method varies from compiler to compiler. The version information is recorded contemporaneously with the build.

Here are a few examples from the $SPEC/config (or %SPEC%\config) directory as of Apr-2017 (there are more examples on your installed copy).

Example-PGI-linux-x86.cfg                CC_VERSION_OPTION  = -V
Example-gcc-linux-x86.cfg                CC_VERSION_OPTION  = -v
Example-intel-compiler-linux-rate.cfg    CC_VERSION_OPTION  = --version
Example-intel-compiler-windows-rate.cfg  CC_VERSION_OPTION  = -QV
Example-studio-solaris.cfg               CC_VERSION_OPTION  = -V
Example-xl-linux-ppc64le.cfg             CC_VERSION_OPTION  = -qversion=verbose

You must specify the option for all compilers that you use.
If you change compilers for some benchmarks, be sure to also change the corresponding version option.

In the example below, most benchmarks use the first three compilers. For intrate peak and fpspeed peak, different compilers are chosen, and the config file changes *both* the compiler variable and the compiler version option.

default:
   CC                 = /bin/gcc
   CC_VERSION_OPTION  = -v
   FC                 = /bin/gfortran
   FC_VERSION_OPTION  = -v
   CXX                = /turboblaster/c++
   CXX_VERSION_OPTION = --print-blaster

intrate=peak:
   CC                 = /bin/xlc
   CC_VERSION_OPTION  = -qversion=verbose

fpspeed=peak:
   FC                 = /bin/xlf
   FC_VERSION_OPTION  = -qversion=verbose
PORTABILITY                Portability flags to be applied no matter what the compiler.
{C|CXX|F}PORTABILITY       Portability flags to be applied to modules of the designated language (for example, CXXPORTABILITY is for the C++ modules).
OPTIMIZE, EXTRA_OPTIMIZE   Optimization flags to be applied for all compilers.
{C|CXX|F}OPTIMIZE          Optimization flags to be applied to modules of the designated language.
EXTRA_{C|CXX|F}FLAGS       Flags that are neither optimization nor portability.
LIBS                       Libraries to add to the link line.
PASSn_OPTIMIZE             Flags for pass "n" compilation when using feedback-directed optimization (FDO). Typically n is either 1 or 2, for the compile done before the training run and the compile done after the training run. See the chapter on Using Feedback for more information.
PASSn_{C|CXX|F}OPTIMIZE    Flags for pass "n" when compiling modules of the designated language.
Many more                  See the chart in the document Make Variables.

III.B. Using OpenMP and/or Autopar

New with CPU2017: For fpspeed (SPECspeed 2017 Floating Point) and intspeed (SPECspeed 2017 Integer), you may build using OpenMP and/or compiler auto-parallelization. This capability is much more likely to be useful for Floating Point, because:

III.B.1. Summary

To enable OpenMP, set the SPEC macro together with your compiler's OpenMP switch:

   OPTIMIZE = -DSPEC_OPENMP <compiler switch for OpenMP>

For example:

   OPTIMIZE = -DSPEC_OPENMP -fopenmp    (GNU)
   OPTIMIZE = -DSPEC_OPENMP -qsmp=omp   (IBM XL)
   OPTIMIZE = -DSPEC_OPENMP -qopenmp    (Intel)
   OPTIMIZE = -DSPEC_OPENMP -xopenmp    (Oracle Studio)
   OPTIMIZE = -DSPEC_OPENMP -mp         (PGI)

III.B.2. SPECrate: no OpenMP. No Autopar.

III.B.3. SPECspeed: your choice

The config file fragment below demonstrates available options.

intspeed:

657.xz_s=peak:
   OPTIMIZE       = --openmp  -DSPEC_OPENMP

fpspeed:
   OPTIMIZE       = --openmp

603.bwaves_s=peak:
   OPTIMIZE       = --autopar

619.lbm_s=peak:
   OPTIMIZE       = --autopar

In the example above:

III.B.4. Detail - conditions for enabling OpenMP

Having considered an example, let us now look at the conditions in detail. The form of the directives in the benchmarks is:

#if (defined(_OPENMP) || defined(SPEC_OPENMP)) && !defined(SPEC_SUPPRESS_OPENMP) && !defined(SPEC_AUTO_SUPPRESS_OPENMP)

How it works:

  1. For reasons of how benchmark reporting is done, there are two duplicated controls that have precisely the same effect on sources:
    SPEC_SUPPRESS_OPENMP is available for you to set if you want to suppress OpenMP.
    SPEC_AUTO_SUPPRESS_OPENMP is used by runcpu when it automatically does so.
  2. If you set -DSPEC_SUPPRESS_OPENMP for a SPECspeed benchmark, that removes the directives.
  3. Or, if you set -DSPEC_OPENMP for a SPECspeed benchmark, that causes the directives to be visible to your compiler.
  4. If you set neither one of them, then the visibility of the directives will depend upon whether the variable _OPENMP is set. Here, it is important to note that _OPENMP is defined by the OpenMP standard. It requires that if a compiler supports a preprocessor, it must set this variable. However, the SPEC CPU benchmarks do not use vendor-native Fortran preprocessors, because the Fortran standard does not define preprocessing. Instead, the preprocessing is done by filepp (specpp), which does not do anything to OpenMP variables unless explicitly told to do so. Thus the SPEC CPU toolset may behave differently than your vendor-native compiler.

    In short, you might find yourself scratching your head wondering why a Fortran OpenMP benchmark is so much faster when you compile it outside the SPEC CPU harness vs. inside.

    Recommendation: To avoid a sore scalp (and before blaming the SPEC CPU tools!), if you wish to enable OpenMP please make sure that you always set -DSPEC_OPENMP. If you do not, then directives might or might not be visible, depending on your compiler and depending on the language.

To make this clearer, assume that we are compiling these benchmark excerpts. (For simplicity, we leave aside the duplicated control.)

From 619.lbm_s (C)
   #if (defined(_OPENMP) || defined(SPEC_OPENMP)) && !defined(SPEC_SUPPRESS_OPENMP)
   #pragma omp parallel for

From 603.bwaves_s (Fortran)
   #if (defined(_OPENMP) || defined (SPEC_OPENMP)) && !defined(SPEC_SUPPRESS_OPENMP)

Further assume that we are using compilers that enable OpenMP by setting a flag called --openmp. The inclusion of the directives will be determined by the truth tables shown below. The important difference occurs at line 7: by the time the Fortran compiler is awakened, it is too late to enable the directives, because that decision was already made by specpp.

619.lbm_s (C): will "#pragma omp parallel for" be included?

     -DSPEC_SUPPRESS_OPENMP  -DSPEC_OPENMP  --openmp    Directive
1            Y                    Y            Y        Not included
2            Y                    Y            N        Not included
3            Y                    N            Y        Not included
4            Y                    N            N        Not included
5            N                    Y            Y        Included
6            N                    Y            N        Included
7            N                    N            Y        Included
8            N                    N            N        Not included

603.bwaves_s (Fortran): will the OpenMP directive be included?

     -DSPEC_SUPPRESS_OPENMP  -DSPEC_OPENMP  --openmp    Directive
1            Y                    Y            Y        Not included
2            Y                    Y            N        Not included
3            Y                    N            Y        Not included
4            Y                    N            N        Not included
5            N                    Y            Y        Included
6            N                    Y            N        Included
7            N                    N            Y        Not included
8            N                    N            N        Not included

III.C. Creating your own Make variables

Variables with a dollar sign and parentheses, aka "round brackets", are substituted by specmake.

Deprecated feature alert: Although it is also possible to pass information to specmake using curly brackets: ${COMPILER_DIR}, this is not recommended. Instead, you should consistently use curly brackets to address runcpu and round brackets to address specmake. It is possible that a future version of runcpu may insist on interpolating curly brackets itself, rather than allowing specmake to do so.


$ cat makevar.cfg 
action            = build
runlist           = 603.bwaves_s
default:
   DEBUG_SYMBOLS  = --debug:symbols=expanded_info_level:42
   FOPTIMIZE      = $(GEE)
default=base:
   GEE            = -g $(DEBUG_SYMBOLS)
default=peak:
   GEE            = -g

$ cat makevar.sh
runcpu --config=makevar --fake --tune=base | grep COMP:
runcpu --config=makevar --fake --tune=peak | grep COMP:
$ ./makevar.sh 
COMP: "f90 -c -o options.o -g --debug:symbols=expanded_info_level:42 <source>"
COMP: "f90 -c -o options.o -g <source>"
$  (Notes about examples) 

The config file above creates two variables (DEBUG_SYMBOLS and GEE). Both are passed to specmake, which interprets them. The results are shown above using the runcpu --fake option.

For an extensive example of variable substitution handled by specmake, see the SPEC CPU2000 example at www.spec.org/cpu2000/docs/example-advanced.cfg. Search that file for LIBS, and note the long comment which provides a walk-through of a complex substitution handled by specmake.

III.D. The operator "+=" is available (but should be used with caution)

The operator "+=" adds to specmake variables. It may be convenient; it also may cause hard-to-diagnose bugs. Example:

$ cat tmp.cfg  
action  = build
runlist = 519.lbm_r
tune    = peak
default:
   OPTIMIZE   = -O1
519.lbm_r=peak:
   OPTIMIZE  += --unroll
default:
   OPTIMIZE  += --inner_unroll
519.lbm_r=peak:
   OPTIMIZE  += --outer_unroll
519.lbm_r=peak=breakfast:
   OPTIMIZE   = --jelly_roll

$ runcpu --fake --config=tmp | grep lbm.c
cc -c -o lbm.o -DSPEC -DSPEC_CPU -DNDEBUG -O1 --inner_unroll --unroll --outer_unroll lbm.c
$ (Notes about examples)   

Note that the options accumulate.

Caution: although the += operator adds flexibility, it may introduce hard-to-predict behavior, which depends on the precedence of section specifiers, the order of your config file, and other features, such as include files. Instead of using '+=', try picking different make variables for different purposes. For an example of hard-to-predict behavior: what will happen if you add --label=breakfast to the above runcpu command? (Try it.)

  • Avoid += to prevent surprises.
  • Keep it simple.
  • Pick different make variables for different purposes.
  • Create conventions for your config files, and write them down in config file comments.
  • If you must use +=, review its effects carefully (--fake is your friend).
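One way to follow the "different make variables" advice, sketched here with invented variable names:

```
default:
   BASE_OPT     = -O1
   UNROLL_OPTS  = --unroll --inner_unroll --outer_unroll
   OPTIMIZE     = $(BASE_OPT) $(UNROLL_OPTS)
```

Each purpose gets its own variable, and OPTIMIZE combines them in exactly one place, so there is no accumulation order to puzzle over.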

III.E Using action=buildsetup to create a sandbox

When debugging a set of build options, it is often useful to create a "sandbox" - that is, a directory where you can play with the benchmark and its options. This example creates a build sandbox with action buildsetup.

$ cat sandbox.cfg 
action        = buildsetup
label         = fast
output_root   = /tmp/demo_buildsetup
runlist       = 519.lbm_r
tune          = peak
   OPTIMIZE   = --fast
$ cat sandbox.sh 
runcpu --config=sandbox | grep log
grep Makefile.spec /tmp/demo_buildsetup/result/CPU2017.001.log
$ ./sandbox.sh 
The log for this run is in /tmp/demo_buildsetup/result/CPU2017.001.log
Wrote to makefile '/tmp/demo_buildsetup/benchspec/CPU/519.lbm_r/build/build_peak_fast.0000/Makefile.spec':

The buildsetup action causes the build directory to be created without compiling anything.
There, the Makefile can be examined, used in a dry run, or modified as part of a testing effort.

$ cd /tmp/demo_buildsetup/benchspec/CPU/519.lbm_r/build/build_peak_fast.0000/
$ grep OPTIMIZE Makefile.spec 
OPTIMIZE         = --fast
$ specmake --dry-run
cc -c -o main.o -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP  --fast  main.c
cc --fast lbm.o main.o -lm  -o lbm_r
$  (Notes about examples) 

See also the chapter on specmake in SPEC CPU2017 Utilities and the sandbox examples in Avoiding runcpu.

III.F. About Automatic Rebuilds

The SPEC CPU tools try to keep config files and binaries synchronized with each other.(*)
Edits to a config file may cause binaries to be rebuilt, sometimes to the surprise(**) of testers.

Testing option sensitivity: The first thing that happens in a rebuild is to delete the old binary.
If that is a potential problem (perhaps it takes a long time to build), you can test whether a config file change will cause a rebuild:

  1. Copy the config file to a test version (say, mycopy.cfg)
  2. Edit the copy.
  3. Remove any lines that change defaults for ignore_errors or verify_binaries
  4. Use: runcpu --config=mycopy --nobuild --size=test --iterations=1 ...

Because --nobuild is used, the command reports on the state of the binaries - complaining about any that are missing or out of date - without deleting or rebuilding them.
No binaries are harmed.


(*) Unless you change verify_binaries. Recommendation: Don't change it.

(**) Recent implementations surprise less often. For detail, see the CPU2006 version of this section.

IV. Config file options for the shell

Some options in your config file cause commands to be executed by your shell (/bin/sh) or by the Windows command interpreter (cmd.exe).

IV.A. \$SHELLVAR variable substitution

Substitution by the shell - or by the Windows command interpreter - uses a backslash followed by a dollar sign.
The backslash protects the variable from interpretation by runcpu.

Example: This config file runs 519.lbm_r twice, with base and peak options. Only peak uses backslashes:

$ cat tmp.cfg
expand_notes         = 1
iterations           = 1
runlist              = 519.lbm_r
size                 = test
tune                 = base,peak
   CC                = gcc
   submit    = echo home=$HOME, spec=$SPEC   > /tmp/chan; ${command}
   submit    = echo home=\$HOME, spec=\$SPEC > /tmp/nui;  ${command}
$ runcpu --config=tmp | grep txt 
    format: Text -> /Users/chris/spec/cpu2017/result/CPU2017.697.fprate.test.txt
$ cd /tmp
$ cat chan nui
home=, spec=
home=/Users/chris, spec=/Users/chris/spec/cpu2017
$ (Notes about examples)   

In base, $HOME and $SPEC are gobbled up by runcpu, which obediently returns their values: nothing at all.
In peak, backslashes prevent runcpu from interpreting the variables, and the shell provides their values.
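The base-tuning behavior can be mimicked in a plain shell, where interpolating a variable that has no value likewise yields empty text (variable names invented):

```shell
# An unset variable interpolates as nothing, just as runcpu substitutes
# nothing for variables it does not know.
unset HOMEDEMO SPECDEMO
echo "home=$HOMEDEMO, spec=$SPECDEMO"    # prints: home=, spec=
```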

IV.B. Shell Options

Warning: SPEC CPU config files can execute arbitrary shell commands.
Read a config file before using it.
Don't be root. Don't run as Administrator. Turn privileges off.

These options cause commands to be executed:


bench_post_setup Command to be executed after each benchmark's run directory setup phase. The rules say that this feature may be used to cause data to be written to stable storage (e.g. sync). The command must be the same for all benchmarks; it is run after each benchmark is set up, for all workloads (test/train/ref), and it is not affected by the setting of parallel_test. This option may only be used in the header section.
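A minimal use, applying the sync example that the rules mention (placed in the header section, before any section markers):

```
bench_post_setup = sync
```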

fdo_pre0 Commands to execute before starting an FDO series.

The fdo_ options let you use a wide variety of Feedback Directed Optimization (FDO) models, including compiler-based and non-compiler-based instrumentation and optimization using single and multi-pass builds. See examples in the section on Using Feedback

Your changes must comply with the rules, such as the requirement to use the SPEC-supplied training data. If in doubt, you may write to SPEC.

fdo_preN Commands to be executed before pass N.
fdo_make_clean_passN Commands to be executed for cleanup at pass N.
fdo_pre_makeN Commands to be done prior to Nth compile.
fdo_make_passN Commands to actually do the Nth compile.
fdo_post_makeN Commands to be done after the Nth compile.
fdo_runN Commands to be used for Nth training run.
fdo_postN Commands to be done at the end of pass N.
monitor_X Commands that allow benchmark monitoring, described in the document on the Monitoring Facility.

post_setup Command to be executed after all benchmark run directories have been set up. The rules say that this feature may be used to cause data to be written to stable storage (e.g. sync).


  1. This option may only be specified in the header section.
  2. The post_setup option is not executed if minimize_rundirs is set.
  3. If parallel_test is greater than 1, post_setup is not executed for test and train.

submit A modified version of the command used to actually run a benchmark. The default command is in ${command}; the rules allow you to supplement it, for example by sending it to a desired location, such as a particular processor.

Several features are typically used in conjunction with submit:

  • $SPECCOPYNUM - job number
  • $BIND - processor number (or other identifier for destination).
  • command_add_redirect - determines whether redirection operators (such as <this_benchmark.in or >that_benchmark.out) are applied to your entire modified submit command (the default) or just to the portion that has ${command}.
    In pseudo-code:

    A bind list has processor numbers n; a benchmark runs exe; it sends output to out.
    If you use:
    submit = send_job job="${command}" cpu_id=$BIND
       Do you prefer this?
             send_job job="exe >out" cpu_id=n
    Or this?
             send_job job="exe" cpu_id=n >out

    The above is only pseudo-code; see the section on Using Submit for real examples.

IV.C. Using submit

The config file feature submit allows you to distribute jobs across a multiprocessor system. This section provides examples to demonstrate how submit works with several other config file features. You might also want to search published results at www.spec.org/cpu2017 for systems that are similar to your system.

IV.C.1. Basic usage of submit

You can use your operating system's facilities that assign jobs to processors, such as dplace, pbind, procbind, prun, start/affinity, or taskset together with ${command} and $SPECCOPYNUM.

Submit Example 1: ${command} and $SPECCOPYNUM

The example below runs 4 copies, sending each one to a different processor.

$ cat taskset.cfg 
copies   = 4
runlist  = 519.lbm_r
submit   = taskset -c $SPECCOPYNUM ${command}
$ cat taskset.sh
runcpu --fake --config=taskset | grep '^taskset' | cut -b 1-75
$ ./taskset.sh 
taskset -c 0 ../run_base_refrate_none.0000/lbm_r_base.none 3000 reference.d
taskset -c 1 ../run_base_refrate_none.0000/lbm_r_base.none 3000 reference.d
taskset -c 2 ../run_base_refrate_none.0000/lbm_r_base.none 3000 reference.d
taskset -c 3 ../run_base_refrate_none.0000/lbm_r_base.none 3000 reference.d
$  (Notes about examples) 

Notice that $SPECCOPYNUM acquires the values 0, 1, 2, 3 in the generated commands, thereby using a different taskset assignment for each.

A problem with the above example: you might not want to send copy #2 to processor #2 and copy #3 to processor #3.
Perhaps you have a system with processors that do not have contiguous ID numbers.
Perhaps you want to spread the work out across a system, or you want to alternate jobs in ping-pong fashion.
You can customize the destinations with a bind list.

Submit Example 2: Using a bind list

A system has 512 virtual processors, 64 chips, 4 cores per chip, 2 threads per core.
We would like to run one copy per chip.
This system does processor binding using pbind -b processor -e command
The bind statement on lines 1 through 5 specifies one processor id from each chip.
Line 11 plugs them into $BIND

$ cat -n pbind.cfg
     1  bind = \
     2       0   8  16  24  32  40  48  56  64  72  80  88  96 104 112 120 \
     3     128 136 144 152 160 168 176 184 192 200 208 216 224 232 240 248 \
     4     256 264 272 280 288 296 304 312 320 328 336 344 352 360 368 376 \
     5     384 392 400 408 416 424 432 440 448 456 464 472 480 488 496 504
     6  copies               = 64           
     7  iterations           = 1         
     8  output_root          = /tmp/pbind
     9  runlist              = 519.lbm_r
    10  size                 = test 
    11  submit               = pbind -b $BIND -e ${command}
    12  verbose              = 40
    13  default:
    14     CC                = cc
    15     CC_VERSION_OPTION = -V
$ cat pbind.sh
runcpu --config=pbind > /dev/null &  # put runcpu in background
sleep 15                             # let things get started
pbind -q > /tmp/pbind/bound          # query bindings
$ ./pbind.sh
$ (Notes about examples) 

The pbind.sh script starts the run, waits 15 seconds, and then checks the status with pbind -q.
Below, we verify that all 64 copies were bound: /tmp/pbind/bound has 64 lines. A few are shown.

$ wc -l /tmp/pbind/bound
      64 /tmp/pbind/bound
$ sort -nk8 /tmp/pbind/bound | head -6
pbind(1M): pid 22673 strongly bound to processor(s) 0.
pbind(1M): pid 22675 strongly bound to processor(s) 8.
pbind(1M): pid 22682 strongly bound to processor(s) 16.
pbind(1M): pid 22693 strongly bound to processor(s) 24.
pbind(1M): pid 22689 strongly bound to processor(s) 32.
pbind(1M): pid 22685 strongly bound to processor(s) 40.

IV.C.2. Script generation by submit

Although Example 2 sent 64 copies where we want them, we only know that because a separate process happened to be watching.
That's not good enough. It would be much better to always leave confirmation that submit does what is intended.
To do so, generate a small script for each copy.

Submit Example 3: Generating scripts and command_add_redirect

This example uses echo to create a script called dobmk for each benchmark copy.
Line 12 writes a processor binding command to the script.
Line 13 appends the ${command} that executes the benchmark.
Line 14 actually runs it.

$ cat -n scriptGen.cfg 
     1  bind = \
     2       0   8  16  24  32  40  48  56  64  72  80  88  96 104 112 120 \
     3     128 136 144 152 160 168 176 184 192 200 208 216 224 232 240 248 \
     4     256 264 272 280 288 296 304 312 320 328 336 344 352 360 368 376 \
     5     384 392 400 408 416 424 432 440 448 456 464 472 480 488 496 504
     6  command_add_redirect = yes     
     7  copies               = 64           
     8  iterations           = 1         
     9  output_root          = /tmp/pbind
    10  runlist              = 519.lbm_r
    11  size                 = test 
    12  submit0              = echo 'pbind -b $BIND \$\$ >> pbind.out' > dobmk
    13  submit2              = echo "${command}" >> dobmk
    14  submit4              = sh dobmk
    15  default:
    16     CC                = cc
    17     CC_VERSION_OPTION = -V
$ cat scriptGen.sh
runcpu --config=scriptGen | grep copies
$ ./scriptGen.sh 
  Setting up 519.lbm_r test base none (64 copies): run_base_test_none.0000-0063
  Running 519.lbm_r test base none (64 copies) [2017-02-10 16:05:16]
$ (Notes about examples) 

A generated dobmk is below. Copy #42 gets bound to processor id 336, which is #42 in the bind list.
Notice that dobmk includes redirection operators, such as >lbm.out
The operators are present because config file line 6 sets command_add_redirect.

$ cd /tmp/pbind/benchspec/CPU/519.lbm_r/run
$ cat run_base_test_none.0042/dobmk 
pbind -b 336 $$ >> pbind.out
../run_base_test_none.0000/lbm_r_base.none 20 reference.dat 0 1 
             100_100_130_cf_a.of > lbm.out 2>> lbm.err
$  [line wrap added for readability] 

All copies run the same benchmark binary, namely the one in directory 0000.
The name of the executable is lbm_r_base.none because label defaults to none.

Lastly, we can verify that the pbind command worked by looking at pbind.out:

$ cat run_base_test_none.0042/pbind.out 
pbind(1M): pid 24422 strongly bound to processor(s) 336.  

The next example generates scripts for numactl.

Submit Example 4: Generating a script for numactl

On line 9, the config file creates a runcpu variable called numactlShow, which holds a command that reports the processor assignment. Lines 10-11 write it to dobmk, along with the command that actually runs the benchmark. On line 12, dobmk is invoked.

$ cat -n numactl.cfg
     1  iterations              = 1
     2  output_root             = /tmp/numactl
     3  size                    = test
     4  tune                    = base
     5  verbose                 = 40
     6  default: # --------- submit stuff ----------------------------------------
     7     bind                 = 2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31
     8     command_add_redirect = yes
     9     numactlShow          = numactl --show | grep phys >> numactl-s.out 2>&1
    10     submit0              = echo "$[numactlShow]"        > dobmk
    11     submit3              = echo "${command}"           >> dobmk
    12     submit5              = numactl --physcpubind=$BIND sh dobmk
    13  default: #---------- compiler stuff --------------------------------------
    14     CC                   = gcc
    15     CC_VERSION_OPTION    = -v
    16     OPTIMIZE             = -O3
    17  intrate: #---------- suite stuff -----------------------------------------
    18     copies               = 11
$ cat numactl.sh
runcpu --config=numactl --fake 557.xz_r | grep "The log for this run"
runcpu --config=numactl        557.xz_r | grep txt
$ ./numactl.sh
The log for this run is in /tmp/numactl/result/CPU2017.001.log
    format: Text -> /tmp/numactl/result/CPU2017.002.intrate.test.txt

When the compression program 557.xz_r runs its test workload, the benchmark binary is actually run a number of times, as it tests various types of compression. We can see the evidence of that if we look at a sample run directory. Notice that every invocation in copy #9's run directory used processor 29, which is entry #9 (counting from zero) in the bind list on line 7 of the config file.

$ cd /tmp/numactl/benchspec/CPU/557.xz_r/run
$ cd run_base_test_none.0009
$ cat numactl-s.out
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
physcpubind: 29
$  (Notes about examples) 

IV.C.3. Quoting and traps for the unwary

Submit Example 3 uses three types of quoting: backslashes, single quotes, and double quotes.

submit0 = echo 'pbind -b $BIND \$\$ >> pbind.out' > dobmk
submit2 = echo "${command}" >> dobmk
submit4 = sh dobmk

The details of the quoting may seem like more than what you want to know, but can be crucial if you develop or maintain submit options. Caution: here be traps. Quote carefully, then check whether it did what you think it did.

"" Double quotes are used on the submit2 line. They do not prevent interpretation of ${command} because runcpu pays no particular attention to double or single quotes. The command to run lbm_r_base.none is inserted with its arguments and device assignments, and the double quotes are still present when the echo executes, where they protect the device assignments.

For example, here is the echo command from copy #42 (how was this found?) [line wrap added for readability]:

echo "../run_base_test_none.0000/lbm_r_base.none 20 reference.dat 0 
   1 100_100_130_cf_a.of > lbm.out 2>> lbm.err" >> dobmk;

TRAP: If you forget the double quotes, you effectively say:

"Please do an echo.
Send standard output to lbm.out.
Send standard error to lbm.err.
No, wait, send standard output to dobmk."

Your operating system does whatever it does with such an odd request; a likely result is that lbm.out is created by the echo, but it has zero bytes, because lbm_r_base.none is not told to write there.
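The forgotten-quotes trap can be reduced to a plain-shell experiment; the file names here are invented:

```shell
# When one command names standard output twice, each named file is
# created (and truncated), but only the last one receives the text.
echo hello > first.out > second.out
cat second.out    # prints: hello
cat first.out     # prints nothing - created, but zero bytes
```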

\$\$ On the submit0 line, the designator for the current process is quoted using backslashes to prevent runcpu from trying to interpret it.
$BIND does not have a backslash, because we do want runcpu to substitute values from the bind list.
TRAP: If the submit0 line uses $$ (without backslashes), runcpu substitutes its own process ID, and all 64 dobmk scripts uselessly try to bind the runcpu process instead of themselves.

'' The submit0 line also uses single quotes, which are still present when the echo is done.
By then, $BIND has already been substituted, and \$\$ has become $$.
The single quotes prevent $$ substitution by the parent shell that runs echo.

For example, here is what runcpu generated for copy #42 (how was this found?)

echo 'pbind -b 336 $$ >> pbind.out' > dobmk

TRAP: If double quotes are used on submit0 instead of single quotes, an interesting bug happens: the parent shell pid is inserted, dobmk runs, and obediently binds its parent to a processor. It does not bind itself. ("Wait, wait", you say, "Bindings are inherited, so binding the parent should be fine, right?" Well, no. Bindings are inherited only for newly created processes. Binding your parent doesn't do a thing for yourself. So, you definitely want the $$ to become dobmk's PID, not its parent's.)
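The difference can again be reduced to a plain-shell experiment (file names invented): single quotes hand a literal $$ to the generated script, while double quotes substitute the generating shell's own PID before the script is ever written:

```shell
echo 'pbind -b 7 $$' > single.txt    # $$ survives, for the script to expand later
echo "pbind -b 7 $$" > double.txt    # $$ becomes this shell's PID right now
cat single.txt                       # prints: pbind -b 7 $$
```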

IV.C.4. Debug tips for submit

  1. Start small. Try to debug your options for a run with, say, 3 copies before trying 2048.

  2. When debugging submit options, increase the log verbose option.

  3. Pick methods that leave evidence behind. If you generate little scripts (like dobmk above), they can be examined.
  4. Capture standard output from your processor assignment command (e.g. pbind.out from line 12 of Example 3). (It would also be a very good idea to capture standard error.)

  5. Use the specinvoke -n dry run option. On the system where Example 3 was run, a grep verifies that there are 64 generated pbind commands. Another grep picks out the one that belongs to copy #42, using head -43. Copy #42 is, of course, the 43rd copy to run, because copy #0 is the first.

    $ pwd
    $ specinvoke -n | grep -c pbind
    64
    $ specinvoke -n | grep pbind | head -43 | tail -1
    echo 'pbind -b 336 $$ >> pbind.out' > dobmk;
    echo "../run_base_test_none.0000/lbm_r_base.none 20 reference.dat 0 
       1 100_100_130_cf_a.of > lbm.out 2>> lbm.err" >> dobmk;
    sh dobmk

    (Because runcpu generates very wide lines, line wraps were inserted for readability.)

IV.C.5. Maintainability suggestions when using submit

Here are some maintainability suggestions for authors of config files that use submit.

To limit line width, you can use any of the three continuation styles discussed at the top of this document.

All three styles are used in this config file. All have the same effect.

Submit Example 5: Line continuation
$ cat -n continued.cfg
     1  command_add_redirect = yes
     2  copies               = 8
     3  iterations           = 1
     4  output_root          = /tmp/submit
     5  runlist              = 505.mcf_r
     6  size                 = test
     7  use_submit_for_speed = yes
     8  default:
     9     CC                = cc
    10     CC_VERSION_OPTION = -V
    11  intrate=base=backslash:
    12     submit0 = echo 'pbind -b $SPECCOPYNUM \$\$ >> pbind.out' > dobmk \
    13               echo "${command}" >> dobmk \
    14               sh dobmk
    15  intrate=base=fieldN:
    16     submit0 = echo 'pbind -b $SPECCOPYNUM \$\$ >> pbind.out' > dobmk
    17     submit2 = echo "${command}" >> dobmk
    18     submit4 = sh dobmk
    19  intrate=base=heredoc:
    20     submit = <<EOT
    21               echo 'pbind -b $SPECCOPYNUM \$\$ >> pbind.out' > dobmk
    22               echo "${command}" >> dobmk
    23               sh dobmk
    24  EOT

To run the config file, the script below uses three runcpu commands with three different --label switches. After the runs finish, it produces a list of dobmk files that were generated, and prints a sample of each type.

$ cat continued.sh
runcpu --config=continued --label=backslash | grep copies
runcpu --config=continued --label=fieldN    | grep copies
runcpu --config=continued --label=heredoc   | grep copies
cd /tmp/submit/benchspec/CPU/505.mcf_r/run
ls *backslash*/dobmk
ls *fieldN*/dobmk
ls *heredoc*/dobmk
for file in *0007/dobmk ; do
   echo ==== $file =====
   cat $file
done
$ ./continued.sh 
  Setting up 505.mcf_r test base backslash (8 copies): run_base_test_backslash.0000-0007
  Running 505.mcf_r test base backslash (8 copies) [2017-02-11 07:38:32]
  Setting up 505.mcf_r test base fieldN (8 copies): run_base_test_fieldN.0000-0007
  Running 505.mcf_r test base fieldN (8 copies) [2017-02-11 07:40:55]
  Setting up 505.mcf_r test base heredoc (8 copies): run_base_test_heredoc.0000-0007
  Running 505.mcf_r test base heredoc (8 copies) [2017-02-11 07:43:15]

run_base_test_backslash.0000/dobmk  run_base_test_backslash.0004/dobmk
run_base_test_backslash.0001/dobmk  run_base_test_backslash.0005/dobmk
run_base_test_backslash.0002/dobmk  run_base_test_backslash.0006/dobmk
run_base_test_backslash.0003/dobmk  run_base_test_backslash.0007/dobmk
run_base_test_fieldN.0000/dobmk  run_base_test_fieldN.0004/dobmk
run_base_test_fieldN.0001/dobmk  run_base_test_fieldN.0005/dobmk
run_base_test_fieldN.0002/dobmk  run_base_test_fieldN.0006/dobmk
run_base_test_fieldN.0003/dobmk  run_base_test_fieldN.0007/dobmk
run_base_test_heredoc.0000/dobmk  run_base_test_heredoc.0004/dobmk
run_base_test_heredoc.0001/dobmk  run_base_test_heredoc.0005/dobmk
run_base_test_heredoc.0002/dobmk  run_base_test_heredoc.0006/dobmk
run_base_test_heredoc.0003/dobmk  run_base_test_heredoc.0007/dobmk

==== run_base_test_backslash.0007/dobmk =====
pbind -b 7 $$ >> pbind.out
../run_base_test_backslash.0000/mcf_r_base.backslash inp.in  > inp.out 2>> inp.err

==== run_base_test_fieldN.0007/dobmk =====
pbind -b 7 $$ >> pbind.out
../run_base_test_fieldN.0000/mcf_r_base.fieldN inp.in  > inp.out 2>> inp.err

==== run_base_test_heredoc.0007/dobmk =====
pbind -b 7 $$ >> pbind.out
../run_base_test_heredoc.0000/mcf_r_base.heredoc inp.in  > inp.out 2>> inp.err
$ (Notes about examples) 

IV.C.6. Reporting of submit usage

If you use the submit feature, a notes section will automatically be created to indicate that you have done so.

                                 Submit Notes
    The config file option 'submit' was used.

You can add notes to that section, or customize it as you wish, by creating lines with notes_submit_NNN. The phrase "The config file option 'submit' was used" must appear somewhere in your customized notes. You can vary the capitalization of the phrase, and you can even break it across multiple lines; it just needs to be present. If it is not, it will be added automatically.

Submit Example 6: Magical notes_submit

The notes on lines 16-20 appear and disappear automatically depending on whether submit is used.

$ cat -n notes_submit.cfg 
     1  iterations              = 1
     2  output_root             = /tmp/notes_submit
     3  size                    = test
     4  verbose                 = 40
     5  default: # --------- submit stuff ----------------------------------------
     6     bind                 = <<EOT 
     7                              2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31 
     8                             37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79
     9                             83, 89, 97
    10  EOT
    11     command_add_redirect = yes
    12     numactlShow          = numactl --show | grep phys >> numactl-s.out 2>&1
    13     submit02             = echo '$[numactlShow]'        > dobmk
    14     submit03             = echo '${command}'           >> dobmk
    15     submit05             = numactl --physcpubind=$BIND sh dobmk
    16     notes_submit_007     = 
    17     notes_submit_011     = The config file option
    18     notes_submit_013     = 'submit' was used to prefer 
    19     notes_submit_017     = prime processors.
    20     notes_submit_019     = 
    21  default: #---------- compiler stuff --------------------------------------
    22     CC                   = gcc
    23     CC_VERSION_OPTION    = -v
    24     OPTIMIZE             = -O
    25  intrate: #---------- suite and benchmark stuff ---------------------------
    26     copies               = 21
    27  intspeed: 
    28     use_submit_for_speed = yes
    29  657.xz_s=peak:   
    30     EXTRA_OPTIMIZE       = -DSPEC_OPENMP -fopenmp
    31     use_submit_for_speed = no
    32     threads              = 29

The script below uses this config file for three runcpu commands:

  1. Run a SPECrate Integer benchmark, 557.xz_r. By default, SPECrate uses submit and therefore config file lines 5-20 apply.
  2. Run base for the SPECspeed Integer benchmark 657.xz_s, using submit because of lines 27-28.
  3. Run peak for the SPECspeed Integer benchmark 657.xz_s, skipping submit because of line 31.

After the 3 runcpu commands, various results are printed, as explained below.

$ cat notes_submit.sh 
runcpu --config=notes_submit 557.xz_r --tune=base | grep txt
runcpu --config=notes_submit 657.xz_s --tune=base | grep txt
runcpu --config=notes_submit 657.xz_s --tune=peak | grep txt

grep -C1 "to prefer" /tmp/notes_submit/result/CPU2017.00*txt

cd /tmp/notes_submit/benchspec/CPU/557.xz_r/run/
  num_dirs=$(ls -d run* | wc -l)
 num_dobmk=$(ls run*/dobmk | wc -l)
echo 557.xz_r has $num_dirs run dirs and $num_dobmk dobmk scripts

cd /tmp/notes_submit/benchspec/CPU/657.xz_s/run/
      num_dirs_base=$(ls -d run*base* | wc -l)
      num_dirs_peak=$(ls -d run*peak* | wc -l)
 num_dobmk_base=$(ls run*base*/dobmk | wc -l)
 num_dobmk_peak=$(ls run*peak*/dobmk 2>/dev/null | wc -l)
echo "657.xz_s base has $num_dirs_base run dir(s) and $num_dobmk_base dobmk script(s)"
echo "657.xz_s peak has $num_dirs_peak run dir(s) and $num_dobmk_peak dobmk script(s)" 

Notice that when grep searches for "to prefer" in the generated reports, it finds that phrase from notes_submit in CPU2017.001.intrate.test.txt and in CPU2017.002.intspeed.test.txt, but not in CPU2017.003.intspeed.test.txt:

$ ./notes_submit.sh 
    format: Text -> /tmp/notes_submit/result/CPU2017.001.intrate.test.txt
    format: Text -> /tmp/notes_submit/result/CPU2017.002.intspeed.test.txt
    format: Text -> /tmp/notes_submit/result/CPU2017.003.intspeed.test.txt

/tmp/notes_submit/result/CPU2017.001.intrate.test.txt-     The config file option
/tmp/notes_submit/result/CPU2017.001.intrate.test.txt:     'submit' was used to prefer
/tmp/notes_submit/result/CPU2017.001.intrate.test.txt-     prime processors.
/tmp/notes_submit/result/CPU2017.002.intspeed.test.txt-     The config file option
/tmp/notes_submit/result/CPU2017.002.intspeed.test.txt:     'submit' was used to prefer
/tmp/notes_submit/result/CPU2017.002.intspeed.test.txt-     prime processors.

557.xz_r has 21 run dirs and 21 dobmk scripts

657.xz_s base has 1 run dir(s) and 1 dobmk script(s)
657.xz_s peak has 1 run dir(s) and 0 dobmk script(s)
$  (Notes about examples) 

The final portion of the output shows that the run directories match the notes.
In short, when submit comes and goes, the notes magically do the same.

V. Config file options for readers

Whether or not you send your result to SPEC, you should fully disclose how you achieved the result. If it requires the installation of the GoFastLinker, you should say so. By setting the appropriate fields in the config file, you can cause information about the GoFastLinker to appear in the reports that are intended for humans.

V.A. Descriptive fields

Here are the fields that you can set to describe your testbed to readers:


fw_bios Customer-orderable name and version of the system firmware (also sometimes called 'BIOS') on the System Under Test (SUT). Most systems have many different pieces of firmware; if these are distributed as one package, document the name and version of that package using this field. Here are some examples that were printed by the dmidecode utility; if you have some other source of information, use whatever name would most help someone who wants to reproduce your results.

American Megatrends Inc. 1.0 05/13/2016
Cisco Systems, Inc. C460M4. 04/25/2016
Dell Inc. 2.0.1 04/20/2016
HP P89 04/12/2016
Insyde Corp. 3.09 02/22/2016
LENOVO -[TBE123H-2.10]- 03/25/2016
The fw_bios field is new with CPU2017.
hw_avail Date hardware first shipped. If more than one date applies, use the LATEST one.
hw_cpu_max_mhz Maximum speed of the CPUs as specified by the chip vendor, in MHz. The hw_cpu_max_mhz field is new with CPU2017.
hw_cpu_name Manufacturer-determined formal processor name.
hw_cpu_nominal_mhz Nominal speed of the CPUs as specified by the chip vendor, in MHz. The hw_cpu_nominal_mhz field is new with CPU2017.

hw_disk Disk subsystem for the SPEC run directories. Three important notes:

  1. Only a single file system may be used for the run directories, as described in the run rules.
  2. If the output_root feature is used, its disk subsystem should be the one described here.
  3. Disks used for other purposes - such as the system disk - are usually not a performance-relevant variable, so need not be described. Note that if you wish to do so, you can extend this field using the same continuation rules as other fields (so, you could use hw_disk01, hw_disk02, etc).
hw_memory Size of main memory (and other performance-relevant information about memory, as discussed in the run rules.)
hw_model Model name.
hw_nchips Number of CPU chips configured. See the discussion of CPU counting in the run rules.
hw_ncores Number of CPU cores configured. See the discussion of CPU counting in the run rules.
hw_ncpuorder Valid number of processors orderable for this model, including a unit. For example, "2, 4, 6, or 8 chips".
hw_nthreadspercore Number of hardware threads per core. See the discussion of CPU counting in the run rules.
hw_other Any other performance-relevant hardware.
hw_pcache 1st level (primary) cache.
hw_power_{id}_cal_date The date the power meter was last calibrated. All power-related fields are new with CPU2017
hw_power_{id}_cal_label The calibration label.
hw_power_{id}_cal_org The name of the organization or institute that did the calibration.
hw_power_{id}_met_inst The name of the metrology institute that certified the organization that did the calibration of the meter.
hw_power_{id}_connection Description of the interface used to connect the power analyzer to the PTDaemon host system, e.g. RS-232 (serial port), USB, GPIB, etc.
hw_power_{id}_label A tag to reference a power analyzer. If you have more than one power analyzer, give them different {id}'s and they will be reported in separate sections. This label will be printed in the section.
hw_power_{id}_model The model name of the power analyzer used for this benchmark run.
hw_power_{id}_serial The serial number uniquely identifying the power analyzer.
hw_power_{id}_setup A brief description of which devices were measured by this device.
hw_power_{id}_vendor Company which manufactures and/or sells the power analyzer.
hw_psu The number and ratings (in Watts) of the system's power supplies.
hw_psu_info Details about the power supplies, like vendor part number, manufacturer, etc.
hw_scache 2nd level cache.
hw_tcache 3rd level cache.
hw_ocache 4th level or other form of cache.
hw_vendor The hardware vendor. An example of usage of this and related fields is given in the test_sponsor section.
license_num The SPEC license number for either the tester or the test_sponsor.
prepared_by Never output in formal reports. If you wish, set this to your own name, so that the rawfile is tagged with your name but the formal reports are not.
sw_avail Availability date for the software used. If more than one date, use the LATEST one.
sw_base_ptrsize Size of pointers in base. Report:
  • "32-bit" if all benchmarks in base are compiled with switches that request only 32-bit pointers.
  • "64-bit" if all benchmarks in base are compiled with switches that request only 64-bit pointers.
  • "32/64-bit" if there is a mix of 32-bit and 64-bit. For example, if the C++ benchmarks use 32-bit pointers, while the Fortran and C benchmarks use 64-bit, then you would report "32/64-bit" here.
sw_compiler Name and version of the compiler. Note that if more than one compiler is used, you can employ continuation lines, as with most other descriptive fields.

sw_file File system (ntfs, ufs, nfs, etc.) for the SPEC run directories. Three important Notes:

  1. Only a single file system may be used for the run directories, as described in the run rules.
  2. If the output_root feature is used, its file system should be the one described here.
  3. File systems used for other purposes - such as for the operating system - are usually not a performance-relevant variable, so need not be described. You can extend this field using the same continuation methods as other fields (e.g. you could use sw_file01, sw_file02, etc).
sw_os Operating system name and version.
sw_other Any other performance-relevant non-compiler software used, including third-party libraries, accelerators, etc.
sw_peak_ptrsize Size of pointers in peak. Report:
  • "32-bit" if all benchmarks in peak are compiled with switches that request only 32-bit pointers.
  • "64-bit" if all benchmarks in peak are compiled with switches that request only 64-bit pointers.
  • "32/64-bit" if there is a mix of 32-bit and 64-bit.
sw_state Multi-user, single-user, default, etc.
tester The entity actually carrying out the tests. An optional field; if not specified, defaults to test_sponsor. An example is given in the test_sponsor section.

test_sponsor The entity sponsoring this test. An optional field; if not specified, defaults to hw_vendor.

For example, suppose that the Genius Compiler Company wants to show off their new compiler on the TurboBlaster 9000 computer, but does not happen to own a maxed-out system with eight thousand processors. Meanwhile, the Pawtuckaway State College Engineering department has just taken delivery of such a system. In this case, the compiler company could contract with the college to test their compiler on the big machine. The fields could be set as:

      test_sponsor = Genius Compilers
      tester       = Pawtuckaway State College
      hw_vendor    = TurboBlaster
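
Several of the fields above (for example sw_compiler, hw_disk, and sw_file) mention that longer descriptions can be continued by adding numbers to the field name. A minimal sketch (the compiler names and versions are invented for illustration):

```
sw_compiler001 = C and Fortran compilers from the
sw_compiler002 = TurboBlaster Compiler Suite, V42
```

The numbered lines appear in order in the report, in the same way as the notes fields described below.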

V.B. Fields can vary by scope.

Fields can appear and disappear based upon scope. For example, if your floating point runs used two Fortran compilers (which is allowed for peak), you could construct a config file that adjusts the fields accordingly:

$ cat tmp.cfg
expand_notes      = 1
hw_vendor         = Turboblaster, Inc.
output_format     = text
output_root       = /tmp/fake
intrate,fprate:
   hw_avail       = Feb-2018
   sw_avail       = Jan-2018
   sw_compiler1   = C: V42.0 of TurboBlaster C/C++
   sw_compiler2   = Fortran: V42.2 of TurboBlaster Fortran
fprate:
   sw_avail       = Apr-2018
   sw_compiler3   = Fortran: V7.3.0 of gfortran
   notes_comp_100 = In base, all benchmarks use Turboblaster Fortran
   notes_comp_110 = In peak, some benchmarks use Turboblaster Fortran and some
   notes_comp_120 =          benchmarks use gfortran, as noted in the report.
$ runcpu --fakereportable --config=tmp intrate fprate | grep rate.txt
    format: Text -> /tmp/fake/result/CPU2017.001.intrate.txt
    format: Text -> /tmp/fake/result/CPU2017.001.fprate.txt
$ cd /tmp/fake/result
$ grep avail CPU2017.001.intrate.txt 
    Test sponsor: Turboblaster, Inc.             Hardware availability: Feb-2018
    Tested by:    Turboblaster, Inc.             Software availability: Jan-2018
$ grep avail CPU2017.001.fprate.txt 
    Test sponsor: Turboblaster, Inc.             Hardware availability: Feb-2018
    Tested by:    Turboblaster, Inc.             Software availability: Apr-2018
$ grep ortran CPU2017.001.intrate.txt 
                      Fortran: V42.2 of TurboBlaster Fortran
$ grep ortran CPU2017.001.fprate.txt 
                      Fortran: V42.2 of TurboBlaster Fortran
                      Fortran: V7.3.0 of gfortran
     In base, all benchmarks use Turboblaster Fortran
     In peak, some benchmarks use Turboblaster Fortran and some
              benchmarks use gfortran, as noted in the report.
$ (Notes about examples)   

In the above example, notice that both the compiler information and the availability date changed in the report, depending on the metric.

V.C. Additional notes for the reader

In addition to the pre-defined fields, you can write as many notes as you wish. These notes are printed in the report, using a fixed-width font. For example, you can use notes to describe software or hardware information with more detail beyond the predefined fields:

   notes_os_001  = The operating system used service pack 2 plus patches 
   notes_os_002  = 31415, 92653, and 58979.  At installation time, the 
   notes_os_003  = optional "Numa Performance Package" was selected.

V.C.1. Notes sections

There are various notes sections. If there are no notes in a particular section, it is not output, so you don't need to worry about making sure you have something in each section.

The sections, in order of appearance, are as follows:

  1. notes_comp_NNN -- Notes about compiler invocation.
  2. notes_port_NNN -- Notes about portability options.
  3. notes_base_NNN -- Notes about base optimization options.
  4. notes_peak_NNN -- Notes about peak optimization options.
  5. notes_submit_NNN -- Notes about use of the submit option.
  6. notes_os_NNN -- Notes about operating system tuning and changes.
  7. notes_plat_NNN -- Notes about platform tuning and changes.
  8. notes_part_NNN -- Notes about component parts (for kit-built systems).
  9. notes_NNN -- General notes.

Notes about the submit command are described above, with the description of the submit option.

V.C.2. Note numbering

Start your notes with the name of the notes section where you want the note to appear, and then add numbers to define the order of the lines. Within a section, notes are sorted by line number. The NNN above is not intended to indicate that you are restricted to 3 digits; you can use a smaller or larger number of digits as you wish, and you can skip around as you like: for example, ex-BASIC programmers might naturally use line numbers 100, 110, 120... But note that if you say notes_plat782348320742972403 you just might encounter the dreaded (and highly unusual) "out of memory" error, so don't do that.

You can optionally include an underscore just before the number, but beware: if you say both notes_plat_105 and notes_plat105, both are considered to be the same line. The last one mentioned will replace the first, and it will be the only one output.
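
For example, this (hypothetical) pair illustrates the collision: both are treated as line 105 of the platform notes, so only the second one appears in the report:

```
notes_plat_105 = This line is silently replaced ...
notes_plat105  = ... by this one, since both count as line 105.
```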

V.C.3. Additional tags

For all sections you can add an optional additional tag of your choosing before the numbers. Notes will be organized within the tags.

The intent of the feature is that it may allow you to organize your system information in a manner that better suits your own categories for describing it.

For example:

$ cat notes_tags.cfg 
iterations      = 1
output_format   = text
output_root     = /tmp/notes_tags
runlist         = 519.lbm_r
size            = test
default:
   CC                       = gcc
   CC_VERSION_OPTION        = -v 
   notes_part_greeting_011  = ++ how
   notes_part_greeting_20   = ++ you?
   notes_part_greeting_012  = ++ are
   notes_part_aname_1       = ++ Alex,
   notes_part_080           = ++ hi

$ cat notes_tags.sh  
runcpu --config=notes_tags | grep txt
grep '++' /tmp/notes_tags/result/CPU2017.001*txt
$ ./notes_tags.sh
    format: Text -> /tmp/notes_tags/result/CPU2017.001.fprate.test.txt
     ++ hi
     ++ Alex,
     ++ how
     ++ are
     ++ you?
$ (Notes about examples) 

V.C.4. Links in notes sections

You can mention URLs in your notes section, and html reports will correctly render them as hyperlinks. For example:

notes_plat_001 = Additional detail may be found at
notes_plat_002 = http://www.turboblaster.com/servers/big/green/

If you like, you can use descriptive text for the link by preceding it by the word LINK and adding the descriptive text in square brackets:

LINK url AS [descriptive text]

The brackets may be omitted if your descriptive text is a single word, without blanks.

For example:

notes_plat_001 = Additional detail may be found at 
notes_plat_002 = LINK http://www.turboblaster.com/servers/big/green/ AS [TurboBlaster Servers]

When the above appears in an html report, it is rendered as:

Platform Notes

 Additional detail may be found at 
 TurboBlaster Servers

And in a text report, it appears as:

                                Platform Notes
     Additional detail may be found at 
     TurboBlaster Servers (http://www.turboblaster.com/servers/big/green/)

Since the text report is not a context in which the reader can click on a link, it is spelled out instead. Note that because the text report spells the link out, the text line is wider than in HTML, PS, and PDF reports. When deciding where to break your notes lines, you'll have to pick whether to plan line widths for text (which may result in thin-looking lines elsewhere) or for HTML/PS/PDF (which may result in lines that fall off the right edge in text). The feature notes_wrap_columns won't help you here, since it is applied before the link is spelled out.

V.D. About Parallel Reporting

If benchmarks are optimized to use multiple threads, cores, and/or chips at run time, this is reported via the phrase:

Parallel: Yes

The value for "Parallel" is derived from the parallel flag attribute for compiler flags that cause binaries to be multi-threaded.

VI. Using Feedback Directed Optimization (FDO)

Feedback Directed Optimization (FDO) is an optimization method that uses multiple steps, typically:

  1. Compile the program with instrumentation that collects profile data.
  2. Run the program on a training workload, generating a profile.
  3. Recompile the program, using the profile to guide optimization decisions.
FDO is also sometimes known as PBO, for Profile-Based Optimization.

This section explores FDO controls and provides examples.
The controls are:

  1. PASSn make variables (summary) (full list)
  2. fdo shell options.
  3. feedback config file option.
  4. --feedback runcpu option.

All examples in this section use --fake, which is especially recommended when debugging FDO commands. You can send fake's very wordy output to a file to study it, or subset it with commands such as grep or findstr.

VI.A. Minimum required: PASSn or fdo_

To use feedback, you must use either a PASSn make variable, which adds flags to a pre-defined sequence of FDO build steps; or an fdo_ shell option, which lets you modify and add to the sequence of FDO steps.

PASSn: The most common way of using Feedback Directed Optimization is by setting PASSn makevars (summary) (full list).
The sequence is:

  1. Build the benchmark with the PASS1 flags.
  2. Run the train workload.
  3. Rebuild with the PASS2 flags.

FDO Example 1: build - train - rebuild
$ cat fdoExample1.cfg 
action          = build
runlist         = 549.fotonik3d_r
tune            = peak
   FC           = tbf90
   PASS1_FFLAGS = --CollectFeedback 
   PASS2_FFLAGS = --ApplyFeedback

This example config file uses TurboBlaster Fortran 90 to build 549.fotonik3d_r, specifying PASS1 and PASS2 flags.
A script picks out a few lines from the log file, including Fortran compile commands for source module readline.f90.

$ cat fdoExample1.sh 
runcpu --config=fdoExample1 --fake | grep -e readline.f90 -e Train 
$ ./fdoExample1.sh 
tbf90 -c -o readline.o -I. --CollectFeedback readline.f90
Training 549.fotonik3d_r with the train workload
tbf90 -c -o readline.o -I. --ApplyFeedback readline.f90
$  (Notes about examples) 

fdo_: Feedback Directed Optimization can also be done by setting up fdo shell commands.
For example:

FDO Example 2: build - train - othertool
$ cat fdoExample2.cfg 
action          = build
runlist         = 549.fotonik3d_r
tune            = peak
   FC           = tbf90
   OPTIMIZE     = -fast -profile:fbdir
   fdo_post1    = /opt/bin/postoptimizer --profile:fbdir

Notice that this config file does not mention PASS1, but does use fdo_post1.
The script below picks out a few lines of interest from the log:

$ cat fdoExample2.sh 
runcpu --config=fdoExample2 --fake \
   | grep -e readline.f90 -e Train -e '^/opt'
$ ./fdoExample2.sh 
tbf90 -c -o readline.o -I. -fast -profile:fbdir readline.f90
Training 549.fotonik3d_r with the train workload
/opt/bin/postoptimizer --profile:fbdir  
$ (Notes about examples) 

Unlike the previous example, the grep command finds only one compile for readline.f90.

VI.B. Flexible build models

You can adjust the fdo commands that are generated. Perhaps the default model is almost correct for your needs, and you just want minor changes. Caution: Before changing an fdo_ option, find its current setting using --fake. Check it in the context of your config file: the commands will vary.

FDO Example 3: Finding fdo commands with --fake

Goal: discover exactly what the SPEC tools are doing during pass 2 cleanup.

$ cat fdoExample3.cfg 
action          = build
output_root     = /tmp/fake
runlist         = 549.fotonik3d_r
tune            = peak
   FC           = tbf90
   PASS1_FFLAGS = --CollectFeedback 
   PASS2_FFLAGS = --ApplyFeedback

The config file sends output to an output_root.
The script below searches the log file in the result/ directory under that root. The command grep -n prints numbered lines matching '%% Fake commands' and the cut command prints out the first 5 words from matching lines.

$ cat fdoExample3.sh 
runcpu --config=fdoExample3 --fake | grep "log for"
grep -n '%% Fake commands' /tmp/fake/result/CPU2017.001.log \
    | cut -f 1-5 -d' ' 
$ ./fdoExample3.sh 
The log for this run is in /tmp/fake/result/CPU2017.001.log
387:%% Fake commands from make.clean
397:%% Fake commands from fdo_make_pass1
438:%% Fake commands from options1
482:%% Fake commands from compiler-version1
496:%% Fake commands from input_generation
611:%% Fake commands from benchmark_run
720:%% Fake commands from compare_run
819:%% Fake commands from fdo_make_clean_pass2
828:%% Fake commands from fdo_make_pass2
869:%% Fake commands from options2
913:%% Fake commands from compiler-version2

Suppose we are specifically interested in what happens during fdo_make_clean_pass2, line 819 in the above list.
To see the details, look just before and just after the matching phrase:

$ cat -n /tmp/fake/result/CPU2017.001.log | head -825 | tail -10 
   817  specmake -n --output-sync fdoclean FDO=PASS2
   819  %% Fake commands from fdo_make_clean_pass2 (specmake -n --output-sync fdoclean FDO=P...):
   820  rm -rf *.o  pscyee.out
   821  find . \( -name \*.o -o -name '*.fppized.f*' -o -name '*.i' -o -name '*.mod' \) -print | xargs rm -rf
   822  rm -rf fotonik3d_r
   823  rm -rf fotonik3d_r.exe
   824  %% End of fake output from fdo_make_clean_pass2 (specmake -n --output-sync fdoclean FDO=P...)
$ (Notes about examples) 

To decipher the above:

  • specmake is invoked with -n, which prints the commands that would be run without executing them; this is how --fake collects them.
  • fdoclean is the make target that cleans up between feedback passes.
  • FDO=PASS2 sets a make variable identifying the current pass.

Combining PASSn and fdo_: You can combine PASSn and fdo_ options while adjusting the feedback commands.
For example, suppose that your compiler does not need to recompile all the modules in step 2; you just want to relink.
Difficulty: by default, build #2 deletes all the object files.

FDO Example 4: build - train - partial rebuild
$ cat -n fdoExample4.cfg 
     1  action                   = build
     2  runlist                  = 549.fotonik3d_r
     3  tune                     = peak
     4  default=peak:
     5     FC                   = pgf90
     6     PASS1_LDFLAGS        = -PGINSTRUMENT -incremental:no
     7     PASS2_LDFLAGS        = -PGOPTIMIZE   -incremental:no
     8     fdo_make_clean_pass2 = rm ${baseexe}
     9     fdo_make_pass2       = specmake build FDO=PASS2

On line 8, the default cleaning action (from Example 3) is changed to remove only the actual executable. Under the usual rules of GNU make, you might expect that would be sufficient to cause build #2 to do only the link step. In this case, it is not, because by default the tools generate specmake --always-make. Line 9 above overrides that default.

$ cat fdoExample4.sh 
runcpu --config=fdoExample4 --fake \
   | grep -e readline.f90          \
          -e Train                 \
          -e '^rm'                 \
          -e '^specmake.*clean'    \
          -e '^specmake.*build'    \
   | cat -n

The runcpu command above is sent to grep to pick lines of interest, which are numbered by cat.
In the output, lines 1-5 show a typical clean action; much less is removed on line 10. Compare line 6 to line 11 to see the differences in specmake build options.

$ ./fdoExample4.sh 
     1  specmake -n --output-sync clean
     2  rm -rf *.o  pscyee.out
     3  rm -rf fotonik3d_r
     4  rm -rf fotonik3d_r.exe
     5  rm -rf core
     6  specmake -n --output-sync --always-make build FDO=PASS1
     7  pgf90 -c -o readline.o -I. readline.f90
     8  rm -rf options.tmpout
     9  Training 549.fotonik3d_r with the train workload
    10  rm fotonik3d_r  
    11  specmake build FDO=PASS2  
$  (Notes about examples) 

Adding FDO steps: You can add more steps.

FDO Example 5: build-train-build-train-build-train-postopt + file movement

For example, you could:

  • Do several training runs, each collecting a different kind of profile.
  • Aggregate the profiles into a single file, moving it into place as needed.
  • Apply the aggregated profile in a final build, followed by a post-optimization tool.

$ cat -n fdoExample5.cfg 
     1  action                = build
     2  command_add_redirect  = yes
     3  runlist               = 641.leela_s
     4  tune                  = peak
     5  default=peak:       
     6      profdir           = /tmp/feedback/profiles
     7      big_profile       = ${profdir}/${benchnum}.aggregated.profile
     8      clean_profile     = mkdir -p ${profdir}; rm -f ${big_profile}
     9      append_to_profile = cat this.prof >> ${big_profile}
    10      #
    11      fdo_pre0          = ${clean_profile}
    12      #                
    13      PASS1_OPTIMIZE    = --collect:paths
    14      fdo_run1          = ${command}               # profile program paths
    15      fdo_post1         = ${append_to_profile}
    16      #                
    17      PASS2_OPTIMIZE    = --collect:dcache
    18      fdo_run2          = ${command}               # profile data patterns
    19      fdo_post2         = ${append_to_profile}
    20      #                
    21      fdo_pre_make3     = mv ${big_profile} ./profile.in
    22      PASS3_OPTIMIZE    = --apply:paths,dcache
    23      fdo_post_make3    = postopt --instrument:icache
    24      fdo_run3          = ${command}               # profile icache packing
    25      fdo_post3         = postopt --fixup:icache

The script below picks out lines of interest using grep.

$ cat fdoExample5.sh 
runcpu --fake --config=fdoExample5 \
  | grep -e '^mkdir'               \
         -e FullBoard.cpp          \
         -e Train                  \
         -e this.prof              \
         -e ^mv                    \
         -e '^postopt'             \
  | cat -n

The three instances of PASSn_OPTIMIZE are carried out on lines 2, 5, and 9 below.
Various manipulations of the profile files are shown on lines 1, 4, 7, and 8.

$ ./fdoExample5.sh 
     1  mkdir -p /tmp/feedback/profiles; rm -f /tmp/feedback/profiles/641.aggregated.profile  
     2  CC -c -o FullBoard.o -DSPEC -DSPEC_CPU -DNDEBUG -I.      --collect:paths         FullBoard.cpp
     3  Training 641.leela_s with the train workload
     4  cat this.prof >> /tmp/feedback/profiles/641.aggregated.profile  
     5  CC -c -o FullBoard.o -DSPEC -DSPEC_CPU -DNDEBUG -I.      --collect:dcache         FullBoard.cpp
     6  Training 641.leela_s with the train workload
     7  cat this.prof >> /tmp/feedback/profiles/641.aggregated.profile  
     8  mv /tmp/feedback/profiles/641.aggregated.profile ./profile.in  
     9  CC -c -o FullBoard.o -DSPEC -DSPEC_CPU -DNDEBUG -I.      --apply:paths,dcache         FullBoard.cpp
    10  postopt --instrument:icache  
    11  Training 641.leela_s with the train workload
    12  postopt --fixup:icache  
$ (Notes about examples) 

VI.C. The config file feedback option

If you use PASSn or fdo_ then, by default, FDO is used for peak builds. The config file option feedback provides an additional control, an "on/off" switch that can be applied by the usual rules of precedence.

A common usage model is to enable feedback everywhere, then turn it off selectively:

FDO Example 6: feedback=no
$ cat -n fdoExample6.cfg 
  1  action         = build
  2  label          = miriam
  3  runlist        = 503.bwaves_r,519.lbm_r,549.fotonik3d_r
  4  tune           = base,peak
  5  fprate:
  6     fdo_post1   = /opt/merge_feedback
  7  519.lbm_r:
  8     feedback    = no
$ cat fdoExample6.sh 
runcpu --fake --config=fdoExample6 | grep -e Building -e ^/opt
$ ./fdoExample6.sh 
  Building 503.bwaves_r base miriam: (build_base_miriam.0000) [2017-02-23 22:31:30]
  Building 519.lbm_r base miriam: (build_base_miriam.0000) [2017-02-23 22:31:31]
  Building 549.fotonik3d_r base miriam: (build_base_miriam.0000) [2017-02-23 22:31:32]
  Building 503.bwaves_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:31:33]
  Building 519.lbm_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:31:35]
  Building 549.fotonik3d_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:31:35]
$ (Notes about examples) 

Lines 5-6 turn feedback on for SPECrate floating point benchmarks. Nevertheless, 519.lbm_r peak does not use FDO: line 8 takes priority over line 6 by the usual rules of precedence. Feedback is not used for base, because the tools are aware of the rule that disallows it.

VI.D. runcpu --feedback

If you use the runcpu option --feedback or its opposite, --nofeedback, all peak benchmarks are affected.
New with CPU2017: The command line wins unconditionally over the config file.

The example below uses the same config file as the previous example.

FDO Example 7: runcpu --feedback
$ cat fdoExample7.sh 
echo build with --feedback
runcpu --fake --config=fdoExample6 --feedback | grep -e Building -e '^/opt'
echo build with --nofeedback
runcpu --fake --config=fdoExample6 --nofeedback | grep -e Building -e '^/opt'
$ ./fdoExample7.sh 
build with --feedback
  Building 503.bwaves_r base miriam: (build_base_miriam.0000) [2017-02-23 22:37:07]
  Building 519.lbm_r base miriam: (build_base_miriam.0000) [2017-02-23 22:37:08]
  Building 549.fotonik3d_r base miriam: (build_base_miriam.0000) [2017-02-23 22:37:09]
  Building 503.bwaves_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:37:10]
  Building 519.lbm_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:37:11]
  Building 549.fotonik3d_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:37:12]

build with --nofeedback
  Building 503.bwaves_r base miriam: (build_base_miriam.0000) [2017-02-23 22:37:16]
  Building 519.lbm_r base miriam: (build_base_miriam.0000) [2017-02-23 22:37:17]
  Building 549.fotonik3d_r base miriam: (build_base_miriam.0000) [2017-02-23 22:37:18]
  Building 503.bwaves_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:37:19]
  Building 519.lbm_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:37:20]
  Building 549.fotonik3d_r peak miriam: (build_peak_miriam.0000) [2017-02-23 22:37:20]
$ (Notes about examples) 

In the run with --feedback, all peak benchmarks use feedback. Base does not, because the tools are aware of the rule.
The run with --nofeedback does not use FDO for any benchmarks.

VII. The config file preprocessor

VII.A. Introduction

The SPEC CPU tools include a configuration file macro preprocessor. This section introduces preprocessor examples, and then discusses a few basics of syntax.

VII.A.1. Preprocessor Example: Picking flags

In this example, the user is testing several configurations, and wants to pick portability flags appropriately.

$ cat -n preprocessor_example1.cfg
     1  500.perlbench_r,600.perlbench_s:  
     2  %if %{chip} eq "sparc"
     3      PORTABILITY    = -DSPEC_SOLARIS_SPARC
     4  %elif %{chip} eq "x86"
     5  %   if %{bits} == 32
     6  %      define suffix IA32
     7  %   else
     8  %      define suffix X64
     9  %   endif
    10      PORTABILITY    = -DSPEC_SOLARIS_%{suffix}
    11  %endif

Lines that start with percent (%) are preprocessor directives. The percent (%) character must be in column 1.

The config file preprocessor is called a macro processor because it allows you to define macros, which are brief abbreviations for longer constructs. In the example above, three macros are used: (%{chip}, %{bits}, and %{suffix}). Macro values are tested, and hunks of config file are included or excluded, using the conditionals %if (lines 2 and 5), and %elif (line 4). One of the conditionals includes an %else clause (line 7).

The preprocessor is automatically run whenever you use runcpu, and macros may be set on the runcpu command line. Or, you can run it separately, as configpp:

$ configpp -c preprocessor_example1.cfg --define chip=sparc | grep -e perlbench -e PORT
$ configpp -c preprocessor_example1.cfg --define chip=x86 | grep -e perlbench -e PORT
$ configpp -c preprocessor_example1.cfg -S chip=x86 -S bits=32 | grep -e perlbench -e PORT

Notice above that preprocessor variables can be set on the command line, using either --define or -S.

VII.A.2. Preprocessor Example: Picking CPUs

The preprocessor can help you assign jobs to processors.
In the example below, Hidekatsu would like to pick the number of copies, or the number of threads, based on how many cores are available:

$ cat hidekatsu.cfg 
include: TurboBlaster9000.inc  

   copies      = %{Use1PerCore}
   copies      = %{Use2PerCore}

   threads     = %{Use3PerCore}
   threads     = %{Use2PerCore}

Notice that the above config file begins by including TurboBlaster9000.inc, which is shown next:

$ cat -n TurboBlaster9000.inc 
  1  # TurboBlaster 9000 has: 
  2  #   1 to 4    chips per system
  3  #   2         cores per chip 
  4  #   3         hardware threads (virtual CPUs) per core
  6  %ifndef %{chips}
  7  %   error Please say runcpu --define chips=N
  8  %endif
 10  %define   cores        2 * %{chips} 
 12  %define   Use1PerCore  1 * %{cores}
 13  %define   Use2PerCore  2 * %{cores}
 14  %define   Use3PerCore  3 * %{cores}
 16  # Use 'bind' to place jobs on alternating chips 
 17  # and alternating cores.
 19         # core 0           core 1 
 20  %if %{chips} == 1   
 21     bind=\
 22           0                3          \
 23           1                4          \
 24           2                5         
 25  %elif %{chips} == 2     
 26     bind=\
 27           0  6             3  9       \
 28           1  7             4 10       \
 29           2  8             5 11      
 30  %elif %{chips} == 3   
 31     bind=\
 32           0  6 12          3  9 15    \
 33           1  7 13          4 10 16    \
 34           2  8 14          5 11 17   
 35  %elif %{chips} == 4   
 36     bind=\
 37           0  6 12 18       3  9 15 21 \
 38           1  7 13 19       4 10 16 22 \
 39           2  8 14 20       5 11 17 23
 40  %endif
Lines 1-4: Comments describe the (fictional) system.
Lines 6-8: We expect to be told the number of chips; if not, exit with an error.
Line 10: Calculate how many cores are available.
Lines 12-14: Above, benchmarks made requests for numbers of copies or threads, such as %{Use2PerCore}. Translate such requests into system totals.
Lines 16-40: A bind statement is constructed to spread the work. (We assume here that virtual CPUs are numbered sequentially: chip 0 has CPUs 0 to 5, its first core has 0,1,2; and so forth.)

Lines 27-29: For example, a system with 2 chips has 4 cores and 12 virtual CPUs. Copies are dispatched such that:

  • A 2-copy benchmark uses one virtual CPU per chip (0 and 6)
  • A 4-copy benchmark uses one CPU per core and 2 CPUs per chip. (0, 6, 3, 9)

If you would like to see the preprocessed config file without actually running it, you can use 'configpp', as shown below. As expected, the number of copies or threads varies with the number of chips.

$ cat hidekatsu.cfg 
include: TurboBlaster9000.inc  

   copies      = %{Use1PerCore}
   copies      = %{Use2PerCore}

   threads     = %{Use3PerCore}
   threads     = %{Use2PerCore}
$ configpp -c hidekatsu \
  --output=one.out \
  --define chips=1


$ tail -17 one.out.cfg 
    0      3   \
    1      4   \
    2      5

   copies   = 2
   copies   = 4

   threads  = 6
   threads  = 4
$ configpp -c hidekatsu \
  --output=two.out \
  --define chips=2


$ tail -17 two.out.cfg 
    0  6    3  9  \
    1  7    4 10  \
    2  8    5 11

   copies   = 4
   copies   = 8

   threads  = 12
   threads  = 8
$ configpp -c hidekatsu \
  --output=four.out \
  --define chips=4


$ tail -17 four.out.cfg 
   0  6 12 18   3  9 15 21 \
   1  7 13 19   4 10 16 22 \
   2  8 14 20   5 11 17 23

   copies   = 8
   copies   = 16

   threads  = 24
   threads  = 16

VII.A.3. Preprocessor Syntax Basics

Emphasis: Column 1. Always punch column 1.

All preprocessor directives begin with the percent sign (%) in column 1. You can -- and probably should -- use indenting, but keep the percent sign on the far left. Any amount of spaces or tabs may separate the percent from the directive. The following are okay:

%define foo
%    define bar
%		undef hello!

The following are not okay:

 %define foo           <-- wrong: space in the first column
	%define foo    <-- wrong: tab in the first column
#define foo            <-- wrong!

Did you catch why the last one is wrong? This isn't CPP: use percent (%), not hash (#)!

One line per directive

The macro preprocessor does NOT follow the same quoting rules described elsewhere for config files. In particular, you may not use line continuation, line appending, or block quotes. You may have a value of arbitrary length, but in the interests of config file readability and maintainability, please keep values relatively short.

VII.B. Defining macros

Macro names may only be composed of alphanumeric characters, underscores, and hyphens, and they ARE case-sensitive.
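
For example, a minimal sketch (macro names invented) showing that case matters:

```
%define Width 64
%ifdef %{width}
%   info This is never printed: 'width' and 'Width' are different macros.
%endif
```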

Macros can be defined in a config file or on the command line.

Macros set on the command-line are defined first; therefore, it is not possible to use the command line to override a macro definition that occurs in the config file itself. If you want the command line to "win", see the notes below about redefinition.
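
One common pattern for letting the command line win is to give a macro its default value only when it has not already been defined. A sketch (the macro name bits is illustrative):

```
# A value from 'runcpu --define bits=32' wins, because
# command-line macros are defined first.
%ifndef %{bits}
%   define bits 64
%endif
```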

A macro that has not been defined will not be substituted.

A macro that has been defined, but not given a value, behaves as if it has the value 1 in logical and numeric contexts, and as an empty string in text contexts.

See the example in the next section.

Having no value is not the same as having an empty string as the value. The following are NOT equivalent:

--define FOO="" FOO is defined, is false, and behaves as an empty string in a string context.
--define FOO FOO is defined, is true,  and behaves as an empty string in a string context.

See the table about Truth and definition, below.
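
A sketch of the difference, using the FOO example above; the expected behavior follows from the rules just stated:

```
%ifdef %{FOO}
%   info FOO is defined
%endif
%if %{FOO}
%   info FOO is true
%endif
```

With --define FOO="", only the first info line prints; with --define FOO (no value), both print.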

VII.B.1. In a config file

To define a macro in a config file, use the %define preprocessor directive. Note that no quoting is necessary when specifying the names of macros or their values.

# Define a simple macro
%define foo bar
# Now the macro called 'foo' has the value 'bar'

# It's not necessary for a macro to have a value to be useful
%define baz
# Now the macro called 'baz' is defined, but it has no value.

Above, the macro baz has been defined, but has not been assigned a value.
Nevertheless, if you use it in an expression, it will behave as if it has the value 1.

$ cat macroDefinedButNoValue.cfg 
%define baz
%ifdef %{baz}
%   info baz is defined. In text contexts, it looks like this: "%{baz}".
%endif
%if %{baz} 
%   info In a logical context, it behaves as if it were 1 (i.e. true)
%endif
%if %{baz} + 3  == 4
%   info In a numeric context, it behaves as if it were 1.
%endif
$ cat macroDefinedButNoValue.sh  
configpp --config=macroDefinedButNoValue | grep INFO
$ ./macroDefinedButNoValue.sh 
  INFO: baz is defined. In text contexts, it looks like this: "".
  INFO: In a logical context, it behaves as if it were 1 (i.e. true)
  INFO: In a numeric context, it behaves as if it were 1.
$  (Notes about examples) 

VII.B.2. On the command line

Macros can also be defined on the command line, using the --define switch. You may find it convenient to abbreviate that switch, if you prefer. The following are entirely equivalent:

$ runcpu --define mymacro=something
$ runcpu -S mymacro=something

VII.B.3. Predefined macros endian, hostname... %{ENV_variable_name}

Some useful macros are predefined, as shown in the table below. New with CPU2017: The config file preprocessor can do environment variable substitution. Environment variables to be substituted use a percent sign and curly brackets, and the name is prefixed with the string "ENV_", as %{ENV_variable_name}.

%{configfile} The name of the config file specified on the command line (possibly not including directory path)
%{endian} Indicator of endian characteristics, for example '12345678' indicating a little-endian system
%{hostname} System Under Test name
%{runcpu} Your original runcpu command
%{top_config_file} The name of the top-level config file (likely the same as %{configfile}, though one may have the full path and the other not)
%{current_config_file} The name of the config file currently being read (same as %{configfile} unless reading an included config file)
%{parent_config_file} The name of the config file that included the config file currently being read (empty for the top-level config file)
%{config_nesting_level} The number of nested includes in effect (0 for the top-level config file)

Any of your environment variables.
This simple example prints the PATH from your environment:

$ cat printPath.cfg 
%info %{ENV_PATH}
$ cat printPath.sh 
configpp --config=printPath | grep INFO:
$ ./printPath.sh 
  INFO: /spec/cpu2017/rc4/bin:/usr/local/bin:/bin/:/usr/
$  (Notes about examples) 
For a complete list, see dumpmacros.
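For instance, predefined macros can be used to label results; a sketch (the notes line and its text are illustrative, not from the original document):

```
# Record where and how this result was produced
notes_000 = Run on %{hostname} via: %{runcpu}
```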

Automating output_root: You can automatically bring an output_root into your config file using %{ENV_GO} and navigate around it, as shown in the example for the ogo utility.

VII.B.4. Example: Adjusting the Environment

Notice that %{ENV_variable_name} brings an environment variable into your config file; what if you want to push it back out to affect other things? In that case, you will also need the feature preenv, which causes runcpu to re-invoke itself with your requested variable in the environment.

For example, you could set up paths for your compiler using something like this:

$ cat -n test.cfg
     1  action       = build
     2  output_root  = /tmp
     3  rebuild      = 1
     4  runlist      = 519.lbm_r
     5  verbose      = 99
     7  %define fivetop /SW/compilers/GCC/Linux/x86_64/gcc-5.3.0
     8  %define sixtop  /SW/compilers/GCC/Linux/x86_64/gcc-6.2.0
    10  %ifdef %{wantGccV6}
    11     preENV_PATH            = %{sixtop}/bin:%{ENV_PATH}
    12     preENV_LD_LIBRARY_PATH = %{sixtop}/lib64:%{ENV_LD_LIBRARY_PATH}
    13  % else
    14     preENV_PATH            = %{fivetop}/bin:%{ENV_PATH}
    15     preENV_LD_LIBRARY_PATH = %{fivetop}/lib64:%{ENV_LD_LIBRARY_PATH}
    16  %endif
    18  default:
    19  CC                = gcc
    20  CC_VERSION_OPTION = -v

The config file above builds a single benchmark, 519.lbm_r, using the local paths for GCC 5.3 or 6.2. Notice that in both cases, the desired path is added to the existing path: for example, on line 12, the right side accesses the existing setting via %{ENV_LD_LIBRARY_PATH} and the left side applies it to the build, via preENV_LD_LIBRARY_PATH. On line 5, the config file sets verbose=99, so that runcpu will print all possible detail and we will be able to confirm the adjusted variables.

Below, the config file is used with and without --define wantGccV6, and the paths are set as desired:

$ export LD_LIBRARY_PATH=my:old:path
$ runcpu -c test  | grep -e 'Setting LD_LIBRARY_PATH' -e 'gcc version'
Setting LD_LIBRARY_PATH = "/SW/compilers/GCC/Linux/x86_64/gcc-5.3.0/lib64:my:old:path"
gcc version 5.3.0 (CDS  07-Dec-2015) 
$ runcpu -c test --define wantGccV6  | grep -e 'Setting LD_LIBRARY_PATH' -e 'gcc version'
Setting LD_LIBRARY_PATH = "/SW/compilers/GCC/Linux/x86_64/gcc-6.2.0/lib64:my:old:path"
gcc version 6.2.0 (CDS  22-Aug-2016) 

VII.B.5. Redefinition

You will receive a warning if a previously defined macro is re-defined. In the example below, the first config file just sets a macro; the second one tests before acting.

$ cat preproc.symbol.just_define.cfg 
%define build_ncpus 20
makeflags   = -j%{build_ncpus}
$ cat preproc.symbol.test_first.cfg 
%ifndef %{build_ncpus}
%   define build_ncpus 20
%endif
makeflags   = -j%{build_ncpus}

A run script uses both config files with and without a definition on the runcpu command line:

$ cat preproc.symbol.sh 
JUST_DEFINE="configpp --config=preproc.symbol.just_define.cfg"
 TEST_FIRST="configpp --config=preproc.symbol.test_first.cfg"
    GREP_IT='grep -e WARNING -e makeflags'
echo original:
$JUST_DEFINE --define build_ncpus=8 | $GREP_IT
$JUST_DEFINE                        | $GREP_IT
echo test1st: 
$TEST_FIRST  --define build_ncpus=8 | $GREP_IT
$TEST_FIRST                         | $GREP_IT

Result: testing first with ifndef avoids a warning and does a better job of respecting the user's wishes.

$ ./preproc.symbol.sh 
original:
WARNING: Redefinition of preprocessor macro 'build_ncpus' on line 1
makeflags   = -j20
makeflags   = -j20
test1st: 
makeflags   = -j8
makeflags   = -j20
$  (Notes about examples) 

VII.C. Un-doing macro definition

Sometimes you want to make the preprocessor forget about a macro that you taught it. This is easily accomplished.

Macros can be undefined in two ways:

  1. On the command line, using the --undef switch, or
  2. In the configuration file, using the '%undef' preprocessor directive.
    %define foo bar
    # Now the macro called 'foo' has the value 'bar'
    %undef foo
    # Now it doesn't

    Note that no quoting is necessary when specifying the names of macros.

As with macro definition, command-line undefinition requests can't affect macros set in the config file, because the config file's definitions effectively happen after the command-line un-definition. For this reason, command-line undefinition is basically useless; it can only undo macros also set on the command line.
So why was such a useless ability added to the tools? The writer likes orthogonality.

VII.D. Using macros

By now you're probably over that initial euphoric rush that comes from wantonly defining and undefining macros, and you're looking for something more. This is it!

When you want to use the value of a macro, you refer to it by name. Unfortunately, the syntax for this is not as simple as you might hope. It's not too complicated, though; to have the preprocessor expand the macro 'foo', just write

%{foo}

in the place where you'd like it to appear. Given the following config file snippet:

%define foo Hello_
%define bar baz
%define foobar Huh?
%define foobaz What?
%define Hello_baz Please don't do this

Here's a handy table to see the various ways you can reference these values:

Macro reference Value
%{foo} Hello_
%{bar} baz
%{foobar} Huh?
%{foobaz} What?
%{Hello_baz} Please don't do this

Easy, right? The following is also possible:

%{foo%{bar}} What?
%{%{foo}%{bar}} Please don't do this

Because macro values can only be one line long, it's not possible to use the preprocessor to macro-ize large chunks of your config file at once, as may be common practice for advanced users of CPP.
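If you do need to switch a multi-line chunk on and off, the usual substitute is a conditional around the block (the flag values shown are made up):

```
%ifdef %{aggressive}
   OPTIMIZE = -O3 -flto
%else
   OPTIMIZE = -O2
%endif
```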

A macro that has not been defined will not be substituted. Note the mention of %{FOO} on the last line of this config file.

$ cat macroDef.cfg 
action          = build
output_format   = text
output_root     = /tmp/mDef
rebuild         = 1
runlist         = 641.leela_s
size            = test
teeout          = 1
default:
   CXX                 = g++
   OPTIMIZE            = %{FOO}

The failure for the first test below is expected: the GNU C++ compiler does not know what to do with %{FOO}.

$ cat macroDef1.sh 
runcpu --config=macroDef | grep -e FullBoard.cpp -e ^Error -e uccess
$ ./macroDef1.sh 
g++ -c -o FullBoard.o -DSPEC -DSPEC_CPU -DNDEBUG -I.  %{FOO}  FullBoard.cpp
Error with make 'specmake --output-sync build': check file 
$ cat /tmp/mDef/benchspec/CPU/641.leela_s/build/build*/make.out
g++ -c -o FullBoard.o -DSPEC -DSPEC_CPU -DNDEBUG -I.  %{FOO}  FullBoard.cpp
g++: error: %{FOO}: No such file or directory

If you want to substitute an empty string, then assign it one.

$ cat macroDef2.sh 
runcpu --config=macroDef --define FOO="" | grep -e FullBoard.cpp -e ^Error -e uccess
$ ./macroDef2.sh 
g++ -c -o FullBoard.o -DSPEC -DSPEC_CPU -DNDEBUG -I.          FullBoard.cpp
Build successes for intspeed: 641.leela_s(base)
$ (Notes about examples) 

VII.E. Conditionals

Defining, undefining, and expanding macros is quite an enjoyable activity in and of itself, and can even be useful on occasion. However, conditionals add an entirely new dimension to config file processing: the ability to include and exclude entire sections of text based on macros and their values.

VII.E.1. %ifdef .. %endif

The %ifdef conditional provides a way to determine whether or not a particular macro has been defined. If the named macro has been defined, the conditional is true, and the text to the matching %endif is included in the text of the config file as evaluated by runcpu. Note that the matching %endif may not necessarily be the next %endif; conditionals may be nested.

For example, given the following section of a config file:

%define foo
%ifdef %{foo}
This text will be included
%endif

%ifdef %{bar}
This text will not be included
%endif

The preprocessor would produce the following output:

This text will be included

Note especially the quoting used for the macro names in the conditional; the only time macro name quoting may be omitted is when defining or undefining it.

VII.E.2. %ifndef .. %endif

The %ifndef conditional is the converse of %ifdef; If the named macro has not been defined, the conditional is true, and the text to the matching %endif is included in the text of the config file as evaluated by runcpu. Note that the matching %endif may not necessarily be the next %endif; conditionals may be nested.

Given a slightly modified version of the example from earlier:

%define foo
%ifndef %{foo}
Now THIS text will not be included
%endif

%ifndef %{bar}
This text WILL be included
%endif

The preprocessor would produce the following output:

This text WILL be included

VII.E.3. %if .. %endif

Checking whether or not a macro is defined is quite useful, but it's just a subset of the more general conditional facility available. This general form is

%if expression

The expression is evaluated using a subset of the Perl interpreter, so the possibilities for testing values are fairly broad. For example,

%ifdef %{foo}

is exactly equivalent to

%if defined(%{foo})


%ifndef %{foo}

is exactly equivalent to

%if !defined(%{foo})

Using the general form, it's possible to string conditionals together:

%if defined(%{foo}) && !defined(%{bar}) || %{baz} == 0

To compare versus a string value, you must supply quotes:

%if %{foo} eq 'Hello, Dave.'

You may perform basic math on macro values:

%if %{foo} * 2004 > 3737

You can do basic regular expression matching, as shown in the example for %info.
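As a sketch, a regular expression test on a hypothetical macro might look like this (the macro name is illustrative):

```
# True if the value of %{compiler_version} starts with "6."
%if %{compiler_version} =~ m/^6\./
%   define using_gcc6
%endif
```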

More precisely, the Perl operations allowed are the :base_core and :base_math bundles, with the ability to dereference and modify variables disallowed. For more details, see the source code for config.pl (the eval_pp_conditional subroutine) and Perl's own Opcode documentation.

Truth and definition

%ifdef %{v} and %if defined(%{v})   test whether the macro 'v' has been defined; they are true even if its value is 0 or an empty string.
%if %{v}                            tests the value of 'v'; it is true only if that value is true under Perl's rules.

Conditional expressions are evaluated using perl's rules for Truth and Falsehood.
Therefore 0 (zero) is defined, and is false:

$ cat test_define.cfg
%ifdef %{a}
   OPTIMIZE = -a
%endif
%if defined(%{b})   
   OPTIMIZE = -b
%endif
%if %{c}
   OPTIMIZE = -c
%endif
$ configpp --config=test_define -S a=1 -S b=1 -S c=1 | grep OPT  
   OPTIMIZE = -a
   OPTIMIZE = -b
   OPTIMIZE = -c
$ configpp --config=test_define -S a=0 -S b=0 -S c=0 | grep OPT
   OPTIMIZE = -a
   OPTIMIZE = -b

VII.E.4. %else

It's possible to get by without the "else" part of the classic "if .. then .. else" trio, but it's not any fun. It works as you'd expect:

%define foo
%ifndef %{foo}
This text will not be included
%else
This text WILL be included (from the else clause)
%endif

The preprocessor would produce the following output:

This text WILL be included (from the else clause)

Only one %else per conditional is allowed.

VII.E.5. %elif

%elif is another convenience that's been added. For those not familiar with CPP, it's an "else if" construct. You may have as many of these as you'd like. Given:

%define foo Hello!

%if !defined(%{foo})
This text will not be included
%elif defined(%{bar})
This text won't be included either
%elif '%{foo}' eq 'Hello!'
This text WILL be included (from the second elif clause)
%else
Alas, the else here is left out as well.
%endif

The preprocessor would produce the following output:

This text WILL be included (from the second elif clause)

VII.F. Informational Directives

It's often helpful to be able to warn or exit on certain conditions. Perhaps there's a macro that must be set to a particular value, or maybe it's just very highly recommended.

VII.F.1. %warning

%warning does just what you'd expect; when the preprocessor encounters this directive, it prints the text that follows to stdout and the current log file, along with its location within the file being read, and continues on.

$ cat warning.cfg 
%if !defined(%{somewhat_important_macro})
%   warning You have not defined somewhat_important_macro!
%endif
$ cat warning.sh 
configpp --config=warning | grep -C2 WARNING
$ ./warning.sh 

  WARNING: You have not defined somewhat_important_macro!
           (From line 2 of
$ (Notes about examples) 

VII.F.2. %error

Like %warning, %error logs an error to stderr and the log file. Unlike %warning, though, it then stops the run.

Consider a slightly modified version of the previous example:

$ cat error.cfg 
%if !defined(%{REALLY_important_macro})
%   error You have not defined REALLY_important_macro!
%endif
$ cat error.sh 
configpp --config=error > /tmp/out
echo runcpu exit code: $?
grep -C2 ERROR /tmp/out
$ ./error.sh 
runcpu exit code: 1

  ERROR: You have not defined REALLY_important_macro!
         (From line 2 of
$ (Notes about examples) 

Unlike a warning, the error will be close to the last thing output. As you can see from the output of echo $? above, runcpu exited with an error code 1.

VII.F.3. %info

The %info directive prints a message preceded by the word INFO:. For example,

$ cat -n info.cfg 
   1  %if !defined(%{chip}) || %{chip} !~ m/(sparc|x86)/
   2  %   error Please use --define chip=sparc or --define chip=x86
   3  %endif
   5  %if  %{chip} eq "sparc"
   6  %   define default_build_ncpus 64
   7  %elif %{chip} eq "x86"
   8  %   define default_build_ncpus 20
   9  %endif
  10  %ifndef   %{build_ncpus}
  11  %   define  build_ncpus   %{default_build_ncpus}
  12  %endif
  14  %info Preprocessor selections: 
  15  %info    .    build_ncpus      %{build_ncpus}
  16  %info    .    chip             %{chip}
  18  make = specmake --jobs=%{build_ncpus} --load-average=%{build_ncpus} 
$ configpp -c info --define chip=sparc | grep -e make -e INFO
INFO: Preprocessor selections:
INFO: .    build_ncpus      64
INFO: .    chip             sparc
make = specmake --jobs=64 --load-average=64

Note in the example above: lines 1 through 3 insist that the user choose a supported chip type; lines 5 through 9 pick a default build parallelism for that chip; and lines 10 through 12 use that default only if the user did not supply --define build_ncpus on the command line.

The %info directive is new with CPU2017.

VII.F.4. %dumpmacros

The %dumpmacros directive prints the values of all macros defined at the point where it appears. Each macro value is preceded by the word 'DBG:'. For example,

$ cat -n dumpmacros.cfg 
     1	%if !defined(%{chip}) || %{chip} !~ m/(sparc|x86)/
     2	%   error Please use --define chip=sparc or --define chip=x86
     3	%endif
     5	%if  %{chip} eq "sparc"
     6	%   define default_build_ncpus 64
     7	%elif %{chip} eq "x86"
     8	%   define default_build_ncpus 20
     9	%endif
    10	%ifndef   %{build_ncpus}
    11	%   define  build_ncpus   %{default_build_ncpus}
    12	%endif
    14	%dumpmacros
$ configpp -c dumpmacros --define chip=sparc | grep -e DBG
  DBG: build_ncpus: '%{default_build_ncpus}'
  DBG: chip: 'sparc'
  DBG: default_build_ncpus: '64'
  DBG: runcpu: 'configpp -c dumpmacros --define chip=sparc'

The %dumpmacros directive is new with CPU2017.

VIII. Output files - and how they relate to your config file

This section describes how the location and contents of several kinds of output files are influenced by your config file.

VIII.A. Help, I've got too many config files!

It was mentioned above that the HASH section of the config file is written automatically by the tools. Each time your config file is updated, a backup copy is made. Thus your config directory may soon come to look like this:

$ cd $SPEC/config
$ ls tune.cfg*
tune.cfg                        tune.cfg.2017-02-05T124831      tune.cfg.2017-02-05T125733
tune.cfg.2017-02-05T120242      tune.cfg.2017-02-05T125557      tune.cfg.2017-02-05T125738
tune.cfg.2017-02-05T122021      tune.cfg.2017-02-05T125603      tune.cfg.2017-02-05T125744
tune.cfg.2017-02-05T122026      tune.cfg.2017-02-05T125608      tune.cfg.2017-02-05T125749
tune.cfg.2017-02-05T124215      tune.cfg.2017-02-05T125614      tune.cfg.2017-02-05T125756
tune.cfg.2017-02-05T124222      tune.cfg.2017-02-05T125620      tune.cfg.2017-02-05T125802
tune.cfg.2017-02-05T124728      tune.cfg.2017-02-05T125626      tune.cfg.2017-02-05T125807
tune.cfg.2017-02-05T124739      tune.cfg.2017-02-05T125632      tune.cfg.2017-02-05T125812
tune.cfg.2017-02-05T124821      tune.cfg.2017-02-05T125727
$  (Notes about examples) 

If this feels like too much clutter, you can disable the backup mechanism, as described under backup_config. Note that doing so may leave you at risk of losing the config file in case of a filesystem overflow or system crash. A better idea may be to periodically remove just portions of the clutter, selectively deleting older versions; or sweep them to a wastebasket every now and then:

$ cd $SPEC/config
$ mkdir wastebasket
$ mv *cfg.2017* wastebasket/

VIII.B. The log file and verbosity levels

$SPEC/result (Unix) or %SPEC%\result (Windows) contains reports and log files. When you are doing a build, you will probably find that you want to pay close attention to the log files such as CPU2017.001.log. Depending on the verbosity level that you have selected, it will contain detailed information about how your build went.

The CPU2017 tool suite provides for varying amounts of output about its actions during a run. These levels range from the bare minimum of output (level 0) to copious streams of information that are probably useful only to tools developers (level 99). Selecting one output level gives you the output from all lower levels, which may cause you to wade through more output than you might like.

VIII.B.1. Useful Search Strings

When you are trying to find your way through a log file, you will probably find these (case-sensitive) search strings useful:

runcpu:           The runcpu command for this log.
Running           Printed at the top of a run of a benchmark.
#                 Printed at the top of a run of a benchmark for runs with multiple iterations. Useful for finding the ref workloads in reportable runs.
runtime           Printed at the end of a benchmark run.
Copy              Times for individual copies in a SPECrate run.
Building          Printed at the beginning of a benchmark compile.
Elapsed compile   Printed at the end of a benchmark compile.
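For instance, several of these strings can be combined in one grep. The sketch below runs against a fabricated three-line log fragment, not real runcpu output:

```shell
# Create a tiny fake log fragment so the search has something to match
cat > /tmp/toylog.txt <<'EOF'
runcpu: runcpu --config=example
Running 519.lbm_r ...
Building 500.perlbench_r base mylabel ...
EOF

# Case-sensitive search for three of the strings at once, with line numbers
grep -n -e 'runcpu:' -e 'Running' -e 'Building' /tmp/toylog.txt
```

On a real log, you would point the grep at something like $SPEC/result/CPU2017.001.log instead.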

VIII.B.2. About Temporary Debug Logs

There are also temporary debug logs, such as CPU2017.001.log.debug. A debug log contains very detailed debugging output from the SPEC tools, as if --verbose 99 had been specified.

For a successful run, the debug log will be removed automatically, unless you specify "--keeptmp" on the command line, or "keeptmp=yes" in your config file.

For a failed run, the debug log is kept. The debug log may seem overwhelmingly wordy, repetitive, detailed, redundant, repetitive, and long-winded, and therefore useless. Suggestion: after a failure, try looking in the regular log first, which has a default verbosity level of 5. If your regular log doesn't have as much detail as you wish, then you can examine the additional detail in the debug log.

If you file a support request, you may be asked to send in the debug log.

VIII.B.3. Verbosity levels

The 'level' referred to in the table below is selected either in the config file verbose option or in the runcpu command as in 'runcpu --verbose n'.

Levels higher than 99 are special; they are always output to your log file. You can also see them on the screen if you set verbosity to the specified level minus 100. For example, the default log level is 5. This means that on your screen you will get messages at levels 0 through 5, and 100 through 105. In your log file, you'll find the same messages, plus the messages at levels 106 through 199.

Level What you get
0 Basic status information, and most errors. These messages can not be turned off.
1 List of the benchmarks which will be acted upon.
2 A list of possible output formats, as well as notification when beginning and ending each phase of operation (build, setup, run, reporting).
3 A list of each action performed during each phase of operation (e.g. "Building 176.gcc", "Setting up 253.perlbmk")
4 Notification of benchmarks excluded
5 (default) Notification if a benchmark somehow was built but nevertheless is not executable.
6 Time spent doing automatic flag reporting.
7 Actions to update SPEC-supplied flags files.
10 Information on basepeak operation.
12 Errors during discovery of benchmarks and output formats.
15 Information about certain updates to stored config files
24 Notification of additions to and replacements in the list of benchmarks.
30 A list of options which are included in the hash of options used to determine whether or not a given binary needs to be recompiled.
35 A list of key=value pairs that can be used in command and notes substitutions, and results of env_var settings.
40 A list of 'submit' commands for each benchmark.
Note: If you would like to see all of the submit commands for every copy and every benchmark invocation, with all your variables (such as $BIND) resolved, try runcpu --verbose=40 --fake, or go to the run directory and use specinvoke -n (dry run).
70 Information on selection of median results.
89 Progress comparing run directory checksum for executables.
90 Time required for various internal functions in the tools.
95, 96, 97, 98 Flag parsing progress during flag reporting (progressively more detail)
99 Gruesome detail of comparing hashes of files being copied during run directory setup.
Messages at the following levels will always appear in your log files
100 Various config file errors, such as bad preprocessor directives, bad placement of certain options, illegal characters...
102 Information about output formats that could not be loaded.
103 A tally of successes and failures during the run broken down by benchmark.
106 A list of runtime and calculated ratio for each benchmark run.
107 Dividers to visually block each phase of the run.
110 Elapsed time for each portion of a workload (if an executable is invoked more than once).
120 Messages about which commands are being issued for which benchmarks.
125 A listing of each individual child process's start, end, and elapsed times.
130 A nice header with the time of the runcpu invocation and the command line used. Information about what happened with your sysinfo program
140 General information about the settings for the current run.
145 Messages about file comparisons.
150 List of commands that will be run, and details about the settings used for comparing output files. Also the contents of the makefile written.
155 Start, end, and elapsed times for benchmark run.
160 Start, end, and elapsed times for benchmark compilation.
180 stdout and stderr from commands run
190 Start and stop of delays
191 Notification of command line used to run specinvoke.

VIII.C. Deciphering an FDO log file

This section demonstrates how to find various portions of a log file using an example build with feedback-directed optimization (FDO). See the section Using Feedback for more information on how to enable FDO.

FDO typically requires two compiles. The first creates an executable image with instrumentation. You run the program with a "training" workload, the instrumentation observes it, and a profile is written. The second compile then uses the profile to improve optimization. SPEC CPU makes all of this relatively easy. Here's a config file that builds 519.lbm_r with FDO:

FDO Example 8: FDO tracks in the log file
$ cat mat.cfg
iterations            = 1
label                 = blue271
runlist               = 519.lbm_r
size                  = test
teeout                = yes
tune                  = peak
   CC                 = gcc
   OPTIMIZE           = -O2
   PASS1_OPTIMIZE     = -fprofile-generate
   PASS2_OPTIMIZE     = -fprofile-use

$ cat mat.sh
runcpu --config=mat | grep -e Training -e lbm.c -e .log
$ ./mat.sh 
gcc -c -o lbm.o -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -O2 -fprofile-generate lbm.c
Training 519.lbm_r with the train workload
gcc -c -o lbm.o -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -O2 -fprofile-use      lbm.c
The log for this run is in /vampir/cpu2017/rc4/result/CPU2017.838.log
$  (Notes about examples) 

The PASSn_OPTIMIZE lines cause FDO to happen. The grep picks out a few lines of interest: compile using -fprofile-generate; run the training workload; recompile with -fprofile-use.

For much more detail, you can examine the log file. In this section, the log file from the above command is used (with light editing for readability, e.g. adjusted white space). Useful search strings are bolded.

'runcpu:' Search for runcpu: to verify that we have the correct log file:

$ go result                 (go)
$ grep runcpu: CPU2017.838.log 
runcpu: runcpu --config=mat

Yes, this looks like the right log. Bring it up in an editor.

'Building' If you search for Building, you will find what was written to Makefile.deps (dependencies) and Makefile.spec (the actual Makefile). In this case, there are no dependencies.

  Building 519.lbm_r peak blue271: (build_peak_blue271.0000) [2017-02-14 05:26:09]
Wrote to makefile 

# End dependencies
# These are the build dependencies
Wrote to makefile 
SOURCES= lbm.c main.c

CC               = gcc
OPTIMIZE         = -O2
OS               = unix
PASS1_OPTIMIZE   = -fprofile-generate
PASS2_OPTIMIZE   = -fprofile-use 

'specmake' Search for specmake. There will be several hits. For a narrower search, use  ^specmake.*build  if your editor allows it. Below,

specmake clean removes old files
specmake build does the actual build
specmake options generates a summary list of options from the build.

Just after the specmake build is the first set of actual compile commands. There are several mandatory flags added by the toolset. For example, a flag is added to suppress OpenMP directives, because this is a SPECrate run. You can also see -O2 -fprofile-generate as requested in the config file.

Issuing make.clean command 'specmake --output-sync clean'
specmake --output-sync clean
Start make.clean command: 2017-02-14 05:26:09 (1487067969.8877)
rm -rf *.o  lbm.out
find . \( -name \*.o -o -name '*.fppized.f*' -o -name '*.i' -o -name '*.mod' \) -print | xargs rm -rf
rm -rf lbm_r
rm -rf lbm_r.exe
rm -rf core
Stop make.clean command: 2017-02-14 05:26:10 (1487067970.06632)
Elapsed time for make.clean command: 00:00:00 (0.178615808486938)
Issuing fdo_make_pass1 command 'specmake --output-sync --always-make build FDO=PASS1'
specmake --output-sync --always-make build FDO=PASS1
Start fdo_make_pass1 command: 2017-02-14 05:26:10 (1487067970.06816)
gcc -c -o lbm.o  -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -O2 -fprofile-generate  lbm.c
gcc -c -o main.o -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -O2 -fprofile-generate  main.c
gcc       -O2  -fprofile-generate        lbm.o main.o             -lm         -o lbm_r
Stop fdo_make_pass1 command: 2017-02-14 05:26:10 (1487067970.73817)
Elapsed time for fdo_make_pass1 command: 00:00:00 (0.670012950897217)
Issuing options1 command 'specmake --output-sync options FDO=PASS1'
specmake --output-sync options FDO=PASS1
Start options1 command: 2017-02-14 05:26:10 (1487067970.73965)

If you are curious about the build command

specmake --output-sync --always-make build FDO=PASS1

the details are:

specmake is GNU Make.
--output-sync consolidates compiler stdout and stderr messages by module
--always-make rebuilds all targets
build selects the target which is the benchmark executable in $SPEC/benchspec/Makefile.defaults
FDO=PASS1 turns on the Feedback Directed Optimization switches matching PASS1 in Makefile.defaults
For more information on how SPEC CPU2017 builds work, see the Make Variables document.

'Training' Search for Training. At the top of this section, runcpu forces single threading (effectively disabling OpenMP) because this is a SPECrate run. It takes a bit over 26 seconds to run the training workload.

Training 519.lbm_r with the train workload
OpenMP environment variables removed: None
OpenMP environment variables in effect:
    OMP_NUM_THREADS       '1'
    OMP_THREAD_LIMIT      '1'
Pre-run environment changes:
    'OMP_NUM_THREADS' added: (value now '1')
    'OMP_THREAD_LIMIT' added: (value now '1')
Commands to run (specinvoke command file):
  -N C
  -C /vampir/cpu2017/rc4/benchspec/CPU/519.lbm_r/build/build_peak_blue271.0000
  -o lbm.out -e lbm.err ../build_peak_blue271.0000/lbm_r 300 reference.dat 0 1 100_100_130_cf_b.of 
Specinvoke: /vampir/cpu2017/rc4/bin/specinvoke -d /vampir/cpu2017/rc4/benchspec/CPU/519.lbm_r/build...
Issuing command '/vampir/cpu2017/rc4/bin/specinvoke -d /vampir/cpu2017/rc4/benchspec/CPU/519.lbm_r/...
/vampir/cpu2017/rc4/bin/specinvoke -d /vampir/cpu2017/rc4/benchspec/CPU/519.lbm_r/build/build_peak_...
Start command: 2017-02-14 05:26:11 (1487067971.19247)
Stop command: 2017-02-14 05:26:37 (1487067997.72349)
Elapsed time for command: 00:00:26 (26.5310192108154)
Workload elapsed time (copy 0 workload 1) = 26.366245 seconds
Copy 0 of 519.lbm_r (peak train) run 1 finished at 2017-02-14 05:26:37. Total elapsed time: 26.3662

Notice the section 'Commands to run'. The key line begins with -o which tells specinvoke to

send output to lbm.out (from -o lbm.out)
send errors to lbm.err (from -e lbm.err)
run the benchmark executable binary lbm_r (from ..path../lbm_r)
with arguments 300 reference.dat 0 1 100_100_130_cf_b.of

Some benchmarks run more than once and have multiple lines that start with -o in the corresponding location. For example, the compression benchmark 557.xz_r compresses several different kinds of input stream. You can see the inputs that SPEC has provided for training purposes in the directories nnn.benchmark/data/train/input and nnn.benchmark/data/all/input. In some cases, the training workloads required significant development effort. As a user of the suite you don't have to worry about that; you simply apply them.

About training fidelity: SPEC is aware that there is some variation in how faithfully benchmark training workloads mirror the timed "ref" workloads. One might argue that a training data set is "not good enough" (compilers that rely on it might guess incorrectly), or that a set is "too good" (compilers can guess too easily). Such variation exists in the real world too, not just in benchmarking, because training data sets can be difficult to find. If you would like to experiment with different training workloads, see the comments on using a sandbox in the document Avoiding runcpu; see the utility convert_to_development; and please be reminded that you must not represent measurements with other workloads as official SPEC metrics.

'specmake' (again): If you search again for specmake, you will come to the second specmake build, which uses the generated profile.

Issuing fdo_make_pass2 command 'specmake --output-sync --always-make build FDO=PASS2'
specmake --output-sync --always-make build FDO=PASS2
Start fdo_make_pass2 command: 2017-02-14 05:26:38 (1487067998.25607)
gcc -c -o lbm.o  -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -O2 -fprofile-use lbm.c
gcc -c -o main.o -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -O2 -fprofile-use main.c
gcc       -O2  -fprofile-use        lbm.o main.o             -lm         -o lbm_r
Stop fdo_make_pass2 command: 2017-02-14 05:26:38 (1487067998.92049) 

And that's it. The tools did most of the work; the user simply set the PASSn flags in the config file.
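The config-file side of the example can be sketched as follows. The flag spellings match the GCC log excerpt above; the section name and the exact PASSn variable choices are illustrative, so adapt them to your compiler and config:

```
# Hedged sketch: PASSn flags for a two-pass GCC feedback-directed build
519.lbm_r=peak:
   PASS1_FLAGS = -fprofile-generate   # pass 1: build an instrumented binary
   PASS2_FLAGS = -fprofile-use        # pass 2: rebuild using the training profile
```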

VIII.D. Help, I've got too many log files!

If you do many builds and runs, you may find that your result directory gets too cluttered. Within a result directory, all output formats other than .rsf can be regenerated from your .rsf files. Therefore, you could reduce clutter by deleting HTML, PDF, and other reports. You can delete old .debug logs unless you plan to submit a support request.

Still feel cluttered? A simple solution is to move your result directory aside under a new name. Don't worry about creating a new directory; runcpu will do so automatically. Take care to ensure no surprises for any currently-running users. If you move result directories, it is a good idea to also clean temporary directories at the same time.

cd $SPEC
mv result old-result
rm -Rf tmp/
cd output_root     # (If you use an output_root)
rm -Rf tmp/

Windows users: You can achieve similar effects using the rename command to move directories and the rd command to remove directories.
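As a sketch, the Unix commands above translate to something like this at a Windows command prompt (directory names as in the Unix example):

```
cd /d %SPEC%
rename result old-result
rd /s /q tmp
```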

VIII.E. Finding the build directory

As described under "About Disk Usage" in runcpu.html, the CPU2017 tools do the actual builds and runs in newly created directories. The benchmark sources are never modified in the src directory.

The build directories for a benchmark are located underneath that benchmark's top-level directory, typically $SPEC/benchspec/CPU/nnn.benchmark/build (Unix) or %SPEC%\benchspec\CPU\nnn.benchmark\build (Windows).

(If you are using the output_root feature, then the first part of that path will change to be your requested root instead of $SPEC.)

The build directories have logical names, typically of the form build_<tune>_<label>.0000. Continuing the FDO log example, this directory was created:

$ go lbm build                (go)
$ ls -ld *blue*
drwxrwxr-x  40 mat staff  1360 Feb 14 05:26 build_peak_blue271.0000

On Windows, you would say cd %SPEC%\benchspec\CPU\519.lbm_r\build followed by dir build*.

If the directory build_<tune>_<label>.0000 already exists when a new build is attempted for the same tuning and label, the directory will be re-used, unless it is locked (for example, because a run is currently underway).

In such cases, the 0000 will be incremented until a name is generated that is available. You can find locked directories by searching for lock=yes in the file $SPEC/benchspec/CPU/<nnn.benchmark>/run/list (Unix) or %SPEC%\benchspec\CPU\<nnn.benchmark>\run\list (Windows).
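The lock=yes search described above can be done in one command; this hedged sketch assumes a Unix system with $SPEC set and the directory layout described earlier:

```shell
# List the per-benchmark run/list files that record a locked directory
grep -l 'lock=yes' "$SPEC"/benchspec/CPU/*/run/list
```

Each file name printed identifies a benchmark with at least one locked directory.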

When more than one build directory has been created for a given tuning and label, you may need to trace the directory back to the specific build attempt that created it. You can do so by searching for the directory name in the log files:

$ grep build_peak_blue271.0000 *log | grep Building
CPU2017.838.log:  Building 519.lbm_r peak blue271: (build_peak_blue271.0000) [2017-02-14 05:26:09]

In the above example, the grep command locates log #838 as the log that corresponds to this build directory. On Windows, of course, you would use findstr instead of grep.

VIII.F. Files in the build directory

A variety of files are output to the build directory. Here are some of the key files which can usefully be examined:

Makefile.spec       The components for make that were generated from the current config file with the current set of runcpu options.
options.out         For 1-pass compile: build options summary.
options1.out        For N-pass compile: summary of the first pass.
options2.out        For N-pass compile: summary of the second pass.
make.out            For 1-pass compile: detailed commands generated.
fdo_make_pass1.out  For N-pass compile: detailed commands generated for the 1st pass.
fdo_make_pass2.out  For N-pass compile: detailed commands generated for the 2nd pass.

VIII.G. For more information

For more information about how the run directories work, see the descriptions of specinvoke, specmake, and specdiff in utility.html.

Appendix A: Auxiliary Benchmark Sets

To reference a set in the benchmark specifier, typically one uses a set that corresponds to a metric, or else default:

intrate    SPECrate 2017 Integer
fprate     SPECrate 2017 Floating Point
intspeed   SPECspeed 2017 Integer
fpspeed    SPECspeed 2017 Floating Point
default    Any benchmark

 # examples of options that might be set in such sections
     OPTIMIZE = -O2
     OPTIMIZE = -O3
     LDOPTIONS = --shared

This appendix describes the Auxiliary Benchmark Sets. These sets are available but require caution; they are not recommended unless you are an expert user.

specspeed    intspeed plus fpspeed 
specrate     intrate  plus fprate
cpu          all benchmarks: the same benchmarks as 'default'; higher priority than 'default'
openmp       benchmarks with OpenMP directives 
serial_speed specspeed benchmarks without OpenMP 
any_fortran  benchmarks using Fortran (in whole or in part) 
pure_fortran benchmarks using Fortran and no other language
  .          Any other benchmark set from
  .             ls $SPEC/benchspec/CPU    (Unix)
  .             dir %SPEC%\benchspec\CPU  (Windows)

Considerations when using Auxiliary Benchmark Sets:

Warning: Usually, you should use only benchmark sets that correspond directly to SPEC metrics: intrate, intspeed, fprate, or fpspeed, because:

  1. The preenv feature can only be used with sections that correspond to metrics.
  2. The rules for base require consistency within a metric.
  3. Other sets add another level of complexity to the precedence rules.

If you use a set that is not a metric, take care to avoid accidental rule violations, surprises from missing preenv variables, and precedence surprises.

Example: The diff command shows that only one line differs between two config files. benchmark_set.wrong.cfg falls into two traps.

$ diff --width 110 --side-by-side  --left-column \
>   benchmark_set.correct.cfg                          benchmark_set.wrong.cfg 
flagsurl       = $[top]/config/flags/gcc.xml          (
output_format  = text                                 (
output_root    = /tmp/benchmark_set                   (
runlist        = intspeed                             (
size           = test                                 (
default:                                              (
   CC          = gcc      -std=c99                    (
   CXX         = g++      -std=c++03                  (
   FC          = gfortran -std=f2003                  (
   # How to say "Show me your version"                (
   CC_VERSION_OPTION    = -v                          (
   CXX_VERSION_OPTION   = -v                          (
   FC_VERSION_OPTION    = -v                          (
default=base:                                         (
   OPTIMIZE             = -O1                         (
intspeed:  # correct                                  | openmp:    # wrong wrong wrong wrong wrong wrong
   EXTRA_OPTIMIZE       = -DSPEC_OPENMP -fopenmp      (
   preENV_OMP_STACKSIZE = 120M                        (

The script runs both in fake mode, and searches the output for the OMP_STACKSIZE setting:

$ cat benchmark_set.sh 
runcpu --fakereportable --config=benchmark_set.correct | grep speed.txt
runcpu --fakereportable --config=benchmark_set.wrong   | grep speed.txt
cd /tmp/benchmark_set/result
grep STACKSIZE *txt
$ ./benchmark_set.sh 
    format: Text -> /tmp/benchmark_set/result/CPU2017.001.intspeed.txt
    format: Text -> /tmp/benchmark_set/result/CPU2017.002.intspeed.txt
CPU2017.001.intspeed.txt:    OMP_STACKSIZE = "120M"
$  (Notes about examples) 

Notice that only benchmark_set.correct.cfg gets the desired OMP_STACKSIZE. (The grep does not match CPU2017.002.intspeed.txt.)

More importantly, benchmark_set.correct.cfg compiles all C benchmarks the same way (as required). From CPU2017.001.intspeed.txt:

                              Base Optimization Flags
C benchmarks: 
     -std=c99 -O1 -DSPEC_OPENMP -fopenmp

The consistency rule is violated by benchmark_set.wrong.cfg and results using it would be non-compliant. From CPU2017.002.intspeed.txt:

                              Base Optimization Flags
C benchmarks:
 600.perlbench_s: -std=c99 -O1
       602.gcc_s: Same as 600.perlbench_s
       605.mcf_s: Same as 600.perlbench_s
      625.x264_s: Same as 600.perlbench_s
        657.xz_s: -std=c99 -O1 -DSPEC_OPENMP -fopenmp

Appendix B. Troubleshooting

When something goes wrong, here are some things to check:

  1. Are there any obvious clues in the log file? Search for the word "Building". Keep searching until you hit the next benchmark AFTER the one that you are interested in. Now scroll backward one screen's worth of text.

  2. Did your desired switches get applied? Go to the build directory, and look at options*out.

  3. Did the tools or your compilers report any errors? Look in the build directory at *err.

  4. What happens if you try the build by hand? See the section on specmake in utility.html.

  5. If an actual run fails, what happens if you invoke the run by hand? See the information about "specinvoke -n" in utility.html

  6. Do you understand what is in your path, and why? Sometimes confusion can be greatly reduced by ensuring that you have only what is needed, avoiding, in particular, experimental and non-standard versions of standard utilities.

    Note: on Windows systems, SPEC recommends that Windows/Unix compatibility products should be removed from the %PATH% prior to invoking runcpu, in order to reduce the probability of certain difficult-to-diagnose error messages.

  7. Try asking the tools to leave more clues behind, with keeptmp.
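Checks 2 and 3 above can be sketched in a couple of commands. The $SPEC variable and the build directory name here are illustrative (taken from the earlier FDO example); substitute your own:

```shell
# Check 2: did your desired switches get applied?
cd "$SPEC"/benchspec/CPU/519.lbm_r/build/build_peak_blue271.0000
cat options*.out

# Check 3: did the tools or your compilers report any errors?
grep -i 'error' ./*.err
```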

Appendix C. Obsolete/removed items

Option Notes
allow_extension_override Now called allow_label_override.
company_name Obsolete. You are probably looking for hw_vendor, tester, or test_sponsor. See the example under test_sponsor.
ext Obsolete. For CPU2017, use label
hw_cpu_char Obsolete. For CPU2006, was used primarily as an additional field to describe MHz. For CPU2017, there are two fields: hw_cpu_nominal_mhz and hw_cpu_max_mhz
hw_cpu_mhz Obsolete. For CPU2017, there are two fields: hw_cpu_nominal_mhz and hw_cpu_max_mhz
hw_cpu_ncoresperchip Obsolete. For CPU2017, specify the total number of enabled cores (hw_ncores) and the total number of enabled chips (hw_nchips); trust the reader to do the division. (If for some reason this is not possible, please use the free-form notes to explain.)
hw_fpu Obsolete.
mach Obsolete. Was removed in CPU2017 because it was rarely used; and it tended to increase both complexity and confusion.
machine_name Obsolete since retirement of CPU95.
max_active_compares Obsolete. Was removed in CPU2017 primarily because of complexity considerations when implementing the new parallel setup methods.
rate Obsolete, because of the change to how benchmarks are defined in CPU2017 (SPECrate = 5xx, SPECspeed = 6xx). See the discussion of removed items in Using SPEC CPU2017.
speed Obsolete, because of the change to how benchmarks are defined in CPU2017 (SPECrate = 5xx, SPECspeed = 6xx). See the discussion of removed items in Using SPEC CPU2017.

This flag has been obsolete since the release of SPEC CPU2006 V1.1. If your compiler creates binaries that use multiple processors, you should instead make sure that your flags file indicates this by setting the flag attribute parallel="yes". The tools will notice if you use a flag that has this flag attribute set, and if so will automatically include that information in the reports.

See the discussion of parallel reporting.

test_date When the tests were run. This field is populated automatically based on the clock in the system under test. Setting this in the config file will generate a warning and the setting will be ignored. If your system clock is incorrect, then the value may be edited in the raw file (see utility.html). It is better to avoid the necessity to edit, by setting your system clock properly.
tester_name Obsolete. You are probably looking for hw_vendor, tester, or test_sponsor. See the example under test_sponsor.
VENDOR This field is obsolete, and has been since SPEC CPU92 was retired.

SPEC CPU®2017 Config Files: Copyright © 2017 Standard Performance Evaluation Corporation (SPEC)