
Standard Performance Evaluation Corporation

 
 

SPEC Glossary

Definitions

active idle

The state in which the SUT must be capable of completing all workload transactions. In a virtualized environment, all virtual machines necessary to support the peak load of the result must be powered on and the corresponding servers ready to respond to client requests.

array

A collection of data items usually laid out linearly in memory for simple access with an integer index from the base of the array.

Many scientific applications use arrays to contain the dataset that they are analyzing. The larger the dataset, the larger the array size needs to be to fit the data.
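As a minimal sketch of the linear layout described above, the following uses Python's standard `array` module. The sample values and the helper name `byte_offset` are illustrative assumptions, not part of any SPEC benchmark.

```python
from array import array

# Hypothetical dataset: five double precision samples stored contiguously.
samples = array("d", [12.5, 13.1, 12.9, 14.0, 13.6])

def byte_offset(index: int) -> int:
    """Distance in bytes from the base of the array to element `index`."""
    return index * samples.itemsize

# buffer_info() reports the base address and the number of elements.
base_address, length = samples.buffer_info()

print(samples[3])      # element reached with an integer index
print(byte_offset(3))  # how far that element sits from the base
print(length)          # number of items in the array
```

A larger dataset simply means a larger `length`, and therefore more memory starting at the same base.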

associate member

A membership class available to non-profit organizations at a significantly reduced cost. It allows access to the benchmarks under development on the condition of significant involvement in the development process. Each group has its own rules and requirements regarding associate membership; contact SPEC for details.

availability date

The date upon which that part of the system becomes generally available; that is, available to anyone willing to pay the appropriate price and take immediate delivery.

baseline

For SPEC's purposes, "baseline" refers to a configuration that is more general and hopefully simpler than one tuned for a specific benchmark.

Usually a "baseline" configuration needs to be effective across a variety of workloads, and there may be further restrictions such as requirements about the ease-of-use for any features utilized.

Commonly "baseline" is the alternative to a "peak" configuration.

benchmark

A reference point.

Originally: a mark on a workbench used to compare the lengths of pieces so as to determine whether one was longer or shorter than desired.

For computers: a "benchmark" is a test, or set of tests, designed to compare the performance of one computer system against the performance of others.

Note: a benchmark is not necessarily a capacity planning tool. That is, benchmarks may not be useful in attempting to guess the correct size of a system required for a particular use. In order to be effective in capacity planning, it is necessary for the test to be easily configurable to match the targeted use. In order to be effective as a benchmark, it is necessary for the test to be rigidly specified so that all systems tested perform comparable work. These two goals are often at direct odds with one another, with the result that benchmarks are usually useful for comparing systems against each other, but some other test is often required to establish what kind of system is appropriate for an individual's needs.

benchmark sponsor

Every benchmark code of SPEChpc96 has a technical advisor who is knowledgeable about the code and the scientific/engineering problem, possibly with the help of experts outside the SPEC organization.

binary

To be specific, binary refers to a numeric representation that is comprised of (frequently very long) sequences of only two values, usually '0' and '1'. Deep down at their very core, most computers really only understand '0' and '1' (or in other words some little bit of information is either "off" or "on"); thus, the term binary is frequently used to describe anything already translated to the form that is closest to what the system understands natively.

chip

An integrated circuit chip: a collection of interconnected microminiature electronic components, such as a microprocessor. The physical package containing one or more computational units, or "cores".

clients

One or several servers that are used to initiate benchmark transactions and record their completion. In most cases, a client simulates the work requests that would normally come from end users, although in some cases the work requests come from background tasks that are defined by the benchmark workloads to be simulated instead of being included in the SUT.

compiler

A program that translates (presumably) human-readable source code into a form that is native for a particular machine.

core

A hardware execution pipeline and associated structures that actually perform the execution of a process or thread. The core set of architectural, computational processing elements that provide the functionality of a CPU.

CPU (Central Processing Unit)

The components within a computer that allow for the interpretation and execution of computational tasks. May be used to refer to a processor chip or a processor core. Term deprecated by SPEC in 2004, replaced by physical chips, cores, and hardware threads.

CPU2000

CPU2000 is the current version of the CPU component benchmark suite from SPEC. It replaces CPU95.

CPU2006

CPU2006 is the name given to the ongoing effort to replace the current CPU2000 product.

CPU92

CPU92 is a now outdated CPU-component benchmark suite from SPEC. This was replaced by CPU95.

CPU95

CPU95 is an earlier version of the CPU component benchmark suite from SPEC, which replaced CPU92 and the even older CPU89.

This suite has in turn been replaced by CPU2000.

CPU intensive

A term that SPEC uses often to mean applications that are primarily bound by the available processing power. Typically, these spend most of their time performing calculations or comparisons or transformations, and do little or no I/O and spend very little time in the operating system.

dataset

The set of inputs for a particular benchmark. There may be more than one dataset available for each benchmark, each serving a different purpose (e.g. measurement versus testing) or configured for a different problem size (small, medium, large, ...).

double precision

A level of floating point accuracy that usually requires twice as much space for each value as single precision does, but provides considerably more precision.

For most systems running the SPEC CPU tests from the OSG (e.g. CPU2000), double precision implies a 64 bit value.
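The size and precision difference can be illustrated with Python's standard `struct` module, which can pack a value as a 32 bit or a 64 bit IEEE 754 number. This is only a sketch; the variable names are illustrative.

```python
import struct

# "<f" and "<d" are struct's standard-size codes for 32 bit and 64 bit floats.
single_size = struct.calcsize("<f")  # bytes used by a single precision value
double_size = struct.calcsize("<d")  # bytes used by a double precision value

value = 0.1  # Python floats are double precision

# Round-trip the value through each representation.
as_single = struct.unpack("<f", struct.pack("<f", value))[0]
as_double = struct.unpack("<d", struct.pack("<d", value))[0]

print(single_size * 8, double_size * 8)  # widths in bits
print(as_double == value)                # the 64 bit round trip is lossless
print(as_single == value)                # the 32 bit round trip loses digits
```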

executable

As an adjective, executable means that the described item can be executed. In computer talk, executable has been also used as a noun, where it means "an executable program" or in other words, something that is ready to run without further modification. Commonly, the term executable is used to refer to the binary file that is the final result of compiling source code.

fileset

A pre-defined set of files that are used within a benchmark workload. Usually a fileset has specific characteristics that are relevant to how the benchmark performs its work.

floating point

A class of arithmetic, typically used in scientific applications. Actually much like the values displayed by your calculator, the values can range from very large down to minute fractions but only the first several digits are available.

Floating point is commonly used when the values being calculated can be very large, into the billions, or else involve fractions; e.g. the number of miles from Earth to the next galaxy (billions and billions), or the precise temperature of a feverish baby (101.8). Floating point is the alternative to integer.
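The "only the first several digits" behavior can be seen in a short sketch. It assumes Python's built-in `float`, which is a 64 bit IEEE 754 value on common platforms; the variable names are illustrative.

```python
# A very large value: about 17 significant decimal digits fit in a 64 bit float.
large = 1.0e16

# Adding 1 falls below the smallest step that can be represented at this
# magnitude, so the sum rounds back to the original value.
lost = (large + 1.0) - large

# A value with a fraction, which integer arithmetic could not hold.
fever = 101.8

print(lost)
print(fever)
```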

For the purposes of classification for the CPU benchmarks, SPEC classifies an application to be a floating point application, if that application typically spends 10% or more of its time in calculating floating point values.

fractional tile

A tile using a "load scale factor" of less than 1.0 that is applied to all of the VMs within that tile. If used, the load scale factor must be between 0.1 and 0.9 in 0.1 increments (e.g. 0.25 would not be allowed).
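The rule above can be sketched as a small check. The function name `is_valid_load_scale_factor` is hypothetical and not part of the SPECvirt harness; a small tolerance is assumed to absorb floating point rounding.

```python
def is_valid_load_scale_factor(factor: float) -> bool:
    """Return True if factor is one of 0.1, 0.2, ..., 0.9."""
    scaled = factor * 10
    tenths = round(scaled)
    # Must be a whole number of tenths, and between 1 and 9 tenths inclusive.
    return 1 <= tenths <= 9 and abs(scaled - tenths) < 1e-9

print(is_valid_load_scale_factor(0.3))   # a whole number of tenths
print(is_valid_load_scale_factor(0.25))  # not a 0.1 increment
print(is_valid_load_scale_factor(1.0))   # a full tile, not a fractional one
```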

full disclosure report

The complete documentation of a benchmark's results along with the relevant system and benchmark configuration information. There should be sufficient detail and coverage for someone else to be able to reproduce the tests.

Each result available on this server has such a disclosure available.

geometric mean

A mean ("average" value) that is obtained through the use of multiplication and Nth roots rather than by addition and division. Thus to calculate: take the Nth root (the power of 1/N) of the product of all N terms.

The geometric mean has the interesting property that a certain percentage change in any one of the terms has the same effect as the same percentage change in any of the other terms, and even successive changes in the same term will have the same effect as if the changes were instead spread over other terms. What this means in benchmarking terms is that a 10% improvement in one benchmark has the same effect on the overall mean as a 10% improvement on any of the other benchmarks, and that another 10% improvement on that benchmark will have the same effect as the last 10% improvement. Thus no one benchmark in a suite becomes more important than any of the others in the suite.
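A minimal sketch of the calculation and of the property just described follows. The scores are made-up illustrative numbers, not published results, and `math.prod` assumes Python 3.8 or later.

```python
import math

def geometric_mean(values):
    """Nth root of the product of all N terms."""
    return math.prod(values) ** (1.0 / len(values))

# Hypothetical per-benchmark scores.
scores = [10.0, 20.0, 40.0]
baseline = geometric_mean(scores)

# Improve a different benchmark by 10% each time.
improved_first = geometric_mean([11.0, 20.0, 40.0])
improved_last = geometric_mean([10.0, 20.0, 44.0])

# Both improvements move the overall mean by the same factor.
print(improved_first / baseline)
print(improved_last / baseline)
```

Whichever term is improved, the overall mean grows by the factor 1.1 raised to the power 1/N.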

hardware partitioning

Subdivides a physical server into fractions, each of which can run an operating system. These fractions are typically created with coarse units of allocations such as whole processors or physical boards.

hardware virtualization

The host program provides the appearance of a separate, isolated, and complete computing system (a virtual machine) to more than one guest. Each guest runs its own operating system.

harness

The framework containing scripts and programs used to control and monitor the test.

HPG

High-Performance Group.

One of several groups within the SPEC organization. HPG has created the benchmark suite SPEChpc96, aiming at high-end machines including both shared-memory and distributed-memory architectures.

HPSC

High-Performance Steering Committee.

Executive group within HPG. Currently HPG and HPSC are the same (i.e., all HPG members are part of HPSC).

HTTP

HyperText Transfer Protocol.

The protocol over TCP/IP by which the WWW communicates. The specifications for HTTP are available from the World Wide Web Consortium, which develops such standards.

hypervisor (virtual machine monitor)

Software (including firmware) that allows multiple VMs to be run simultaneously on a server.

integer

A class of arithmetic, commonly used in computers. Integer arithmetic deals only in whole numbers; e.g. 1, 2, 99, 4563. Any calculation that does not result in a nice whole number is truncated back to a nice whole number; the fractional part is thrown away, e.g. 9 / 4 = 2 and not 2.25 or two and a quarter.
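The 9 / 4 example can be reproduced in a short sketch. Python's `//` operator is used here for integer division; for positive operands it gives the same result as truncation.

```python
whole = 9 // 4     # integer division: the fractional part is thrown away
exact = 9 / 4      # floating point division keeps the fraction
remainder = 9 % 4  # what integer division discarded, as a whole number

print(whole, exact, remainder)
```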

Typically, computers can perform integer arithmetic more quickly than they can any other form of arithmetic, so most programs do as much work as they can in integer. However, most computers have significant limits on the values they can manage in integer format. Besides the lack of fractions, many computers cannot handle integer values beyond the millions. Thus integers can be used to count time, or to keep track of all the pennies in your bank account. However, most scientific applications deal with large values or need to be more precise than just throwing away the fractions. These kinds of applications then make use of floating point arithmetic.

For the SPEC CPU benchmarks, applications are classified as "integer" if they spend less than 1% of their time performing floating point calculations (which covers most non-scientific applications, e.g. compilers, utilities, simulators, etc.).

LADDIS

The name of a performance group that originated the benchmark that came to be known as SPEC SFS.

The name is an acronym of the companies from the original members:

  • Legato
  • Auspex
  • Data General
  • Digital
  • Interphase
  • Sun

libraries

In computer terms, a library is a collection of subroutines provided by the operating system or development environment that can be used to perform certain common tasks (e.g. read something off of disk, create a window on the display, sort an array of values, calculate the cosine of a value, etc.).

license agreement

An agreement that each licensee accepts prior to use of a product. In the SPEC case, this agreement covers what can and cannot be done with the SPEC benchmarks; usually stating that any public use of any SPEC metrics must come from tests that were in complete agreement with the run and reporting rules for that benchmark.

load generator

Something that provides part of a workload to a SUT for a benchmark. Commonly in SPEC usage, this term applies to a "client" system that is used to drive the SUT over a LAN; however this term can also be used to describe a process (either on a "client" or the SUT) which is generating a load for the benchmark.

load level

For any benchmark which submits various amounts of work to a SUT, a load level is one such amount of work. This is usually in terms of expected throughput; such as "a load level of 100 operations per second was tried, but the SUT was not able to keep up and was only able to complete 80..."

metric

The final results of a benchmark. The significant statistics reported from a benchmark run. Each benchmark defines what are valid metrics for that particular benchmark.

multi-core

A microprocessor that contains two or more CPU cores. For example, a dual core microprocessor can execute two software tasks with independent address spaces simultaneously.

object code

Object code is commonly the product of running source code through a compiler. It is usually a binary representation of the program statements translated into a form that is understood natively by the processor.

ops

Operations Per Second.

Usually the units of a throughput metric (for example in SPEC SFS97_R1).

The average number of operations performed per second, where the "operation" has been specified by the benchmark standard.
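As a sketch of the arithmetic only, the average can be computed as below. The function name and the figures are illustrative assumptions; what counts as one "operation" is defined by each benchmark.

```python
def ops_per_second(completed_operations: int, elapsed_seconds: float) -> float:
    """Average number of operations completed per second of measurement."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_operations / elapsed_seconds

# Hypothetical run: 24,000 operations completed in a 300 second interval.
achieved = ops_per_second(24_000, 300.0)
print(achieved)
```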

OSG

The Open Systems Group within SPEC.

This group works on benchmarks for evaluating the performance of systems running 'open' (or publicly defined) operating systems (e.g. UNIX and its derivations, as well as NT and VMS).

See the OSG home page.

OSSC

Open Systems Steering Committee.

Executive decision making body within the OSG.

OS virtualization

The host program provides the appearance of an operating system to more than one guest. Each guest has only its own user-space instance of the operating system. The kernel is shared between the host and all guests.

parallelizable

The property of a computer program, or program segment, that allows for the parallel execution of parts of the same program. Parallel programming covers a wide range of degrees; from the very small grain (e.g. similar operations on multiple elements of the same array or matrix), to large grain (e.g. simultaneous execution of unrelated procedures).

peak

For SPEC's purposes, a "peak" configuration is one where the configuration is tuned especially to get the best result for a single, specific workload.

Typically, this demonstrates the highest performance levels achievable.

"Peak" is often used in combination with "baseline" configurations.

performance neutral

Performance neutral means that there is no significant difference in performance. For example, a performance neutral source code change would be one which would not have any significant impact on the performance as measured by the benchmark.

portability

Portability flags or changes are those which are necessary for the correct execution of a benchmark. That is, the benchmark will not run or will produce the wrong output without these flags or changes.

portable

In computer terms, portable means that the code in question can be easily taken to a different system and made to work there. Code that is dependent upon quirks or specific resources of a certain system is usually considered not to be portable because of the difficulties in finding means of supporting these dependencies on the new system. The use of standardized definitions and interfaces, e.g. ANSI-C and Posix, greatly aids portability because the difficult dependencies are hidden behind the standardized interfaces and the difficulties are shifted from the programmer to the system provider.

power monitoring systems

The power analyzer, temperature monitor, and system(s) running the applications that control the collection and recording of power information for the benchmark. They are not a part of the SUT.

process

A separate executable, loadable module. A collection of code, data, and other system resources, including at least one thread of execution, which performs a data processing task. A running application, its address space, and its resources. A task with high protection and isolation against other processes.

processor

A CPU, either the chip or the core elements within a chip. Another term SPEC deprecated.

reference time

The amount of time that a particular benchmark took to run on a specific reference platform.

reporting rules

The set of benchmark rules that defines what constitutes a valid full disclosure for that benchmark. Usually these define which parts of the benchmark configuration and the system configuration(s) need to be detailed and disclosed.

response time

The amount of time from when an action is requested until the time that the request completes and is returned to the requestor.
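A minimal sketch of measuring one response time from the requestor's side follows. The helper `timed_call` and the stand-in request are hypothetical; a real benchmark harness defines its own timing points.

```python
import time

def timed_call(action):
    """Run action() and return (result, seconds from request to completion)."""
    start = time.perf_counter()             # the action is requested
    result = action()                       # the request is serviced
    elapsed = time.perf_counter() - start   # the result is back with the requestor
    return result, elapsed

# Stand-in for a real request: sum the integers below 1000.
result, response_time = timed_call(lambda: sum(range(1000)))
print(result, response_time)
```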

result

The value of the primary metric being reported for the benchmark.

run rules

The set of benchmark rules that defines what constitutes a valid test with that benchmark. Usually these define legal configurations, experimental limitations, and any operating constraints.

scalable (speed)

The percentage increase in measured speed of execution as additional resources of a given type are increased.

scalable (throughput)

The percentage increase in measured throughput as additional resources of a given type are increased.
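Both scalability definitions reduce to a percentage increase between two measurements. The sketch below uses made-up numbers and a hypothetical helper name.

```python
def percent_increase(before: float, after: float) -> float:
    """Percentage increase of a measurement after resources were added."""
    return (after - before) / before * 100.0

# Hypothetical measurements: throughput with one unit of a resource,
# then with two units of the same resource.
throughput_one = 100.0
throughput_two = 180.0

print(percent_increase(throughput_one, throughput_two))
```

In this illustration, doubling the resource yields an 80% increase rather than 100%, which is a common pattern when scaling is less than linear.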

script

A file that contains a sequence of instructions for an interpreter, the "script" for that interpreter to follow.

SERT

The Server Efficiency Rating Tool (SERT) was created by Standard Performance Evaluation Corporation (SPEC) at the request of the US Environmental Protection Agency. It is intended to measure server energy efficiency, initially as part of the second generation of the US Environmental Protection Agency (EPA) ENERGY STAR for Computer Servers program. Designed to be simple to configure and use via a comprehensive graphical user interface, the SERT uses a set of synthetic worklets to test discrete system components such as memory and storage, providing detailed power consumption data at different load levels.

server

A host system that is capable of supporting a single operating system or hypervisor. The server consists of one or more enclosures that contain hardware components such as the processors, memory, network adapters, storage adapters, and any other components within that enclosure, as well as the mechanism that provides power for these components. In the case of a blade result, the server includes the blade enclosure.

SFS93

Known as SPEC SFS, SFS93 is the NFS server benchmark which evolved from LADDIS.

SFS97

SPEC SFS97 is the NFS server benchmark which replaced SFS93.

SFS97_R1

SFS97_R1 is version 3 of the NFS benchmark, replacing the SFS97 suite.

shell

A UNIX term for a command interpreter and its environment. Thus, typically a program that supports the interpretation and execution of commands.

single precision

A level of floating point accuracy that usually requires half as much space for each value as double precision does, but provides considerably less precision.

For most systems running the SPEC CPU tests from the OSG (e.g. CPU95), single precision implies a 32 bit value.

source code

The human readable form of a computer program. This is typically the form in which the program is written, read, and modified by its human author(s).

SPEC

Standard Performance Evaluation Corporation.

SPEC is an organization of computer industry vendors dedicated to developing standardized benchmarks and publishing reviewed results.

See SPEC's home page.

SPEC95

A common (mis)name for the CPU95 benchmarks. Also, SPEC89 implies CPU89, SPEC92 should be CPU92, and SPEC2000 is CPU2000.

SPECchem96

Official name of the Gamess application of SPEChpc96; an application representative of computations used by the chemical industry.

SPEChpc96

The first benchmark suite released by SPEC/HPG. Includes the two applications Seismic and Gamess.

SPECjvm98

SPECjvm98 is the current Java Virtual Machine benchmark suite from SPEC.

SPECmark

SPECmarks were the metrics for SPEC's original CPU89 benchmarks. Now, the term is often used to refer collectively to the CPU95 ratio speed metrics.

SPECrate

A "SPECrate" is a throughput metric based on the SPEC CPU benchmarks (such as SPEC CPU95).

This metric measures a system's capacity for processing jobs of a specified type in a given amount of time.

Note: This metric is used the same for multi-processor systems and for uni-processors. It is not necessarily a measure of how fast a processor might be, but rather a measure of how much work the one or more processors can accomplish.

The other kind of metrics from the SPEC CPU suites are SPECratios, which measure the speed at which a system completes a specified job.

SPECratio

A measure of how fast a given system might be.

The "SPECratio" is calculated by taking the elapsed time that was measured for a system to complete a specified job, and dividing that into the reference time (the elapsed time that job took on a standardized reference machine).

This measures how quickly, or more specifically: how many times faster than a particular reference machine, one system can perform a specified task.
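The calculation described above can be sketched as follows. The times are made-up illustrative values, not real reference times, and the function name is an assumption.

```python
def spec_ratio(reference_seconds: float, measured_seconds: float) -> float:
    """Reference time divided by the measured elapsed time."""
    return reference_seconds / measured_seconds

# Hypothetical job: 1400 s on the reference machine, 700 s on the tested system.
ratio = spec_ratio(1400.0, 700.0)
print(ratio)
```

A ratio above 1 means the tested system finished the job faster than the reference machine; a ratio below 1 means it was slower.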

"SPECratios" are one style of metric from the SPEC CPU benchmarks; the other is SPECrates.

SPECseis96

Official name of the Seismic application of SPEChpc96; an application representative of computations used by the seismic industry.

SPECweb2005

SPECweb2005 is a standardized performance test for WWW servers, the successor to SPECweb99 and SPECweb99_SSL. The benchmark consists of different workloads (both SSL and non-SSL), such as banking and e-commerce, and writes dynamic content in scripting languages to more closely model real-world deployments. The web server also communicates with a lightweight backend to simulate an application/database server.

SPECweb96

SPECweb96 is SPEC's first attempt at a benchmark for WWW servers. It measures a server's ability to handle HTTP/1.0 GET requests from a number of external "client" drivers.

SPECweb99

SPECweb99 is one of the current web server benchmarks, which replaced the SPECweb96 product.

sponsor

In the OSG: The entity that has accepted the license agreement. In other words, the people who are responsible for ensuring that the results were obtained in accordance with any existing run and reporting rules.

For the HPG, see benchmark sponsor who is a technical advisor for a particular benchmark.

steering committee

Part of the SPEC bureaucracy; each free-standing group within SPEC has a steering committee which acts as the key decision making body, with full membership votes typically being reserved for benchmark ratifications and elections.

SUT

System Under Test

The server and performance-critical components that execute the defined workloads. Storage hardware and network hardware needed to complete the requested work is included. Client hardware used to initiate and monitor the workflow is not included.

TCP/IP

A networking protocol developed for the creation of a robust "internet" being a connection across a variety of local networking mechanisms. Or: the protocol used to connect to and through what is known today as the Internet (or just the 'Net).

The internet uses a layered architecture with several protocols. The TCP (Transmission Control Protocol) defines session based communications, and the IP (Internet Protocol) addresses the lower level issues of packet fragmentation and routing.

testbed

The entire test setup, including the SUT and any external systems used to drive or coordinate or monitor the benchmark.

thread

A component or resource of a process. Threads of execution occur within a process; a thread is not a separate executable, loadable module, and it shares the address space of the originating process. A thread consists of a program counter, a set of registers, and a stack pointer. Each thread has its own execution stack and is capable of independent input/output. All threads share the virtual memory address space of their task. This allows rapid context switching because threads require little or no memory management.
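The shared address space can be illustrated with Python's standard `threading` module. This is only a sketch: the counter, the worker function, and the thread count are illustrative, and a lock is used so that updates to the shared data do not interleave.

```python
import threading

# Data living in the process's address space, visible to every thread.
counter = {"value": 0}
lock = threading.Lock()

def worker(increments: int) -> None:
    """Each thread runs this on its own stack but updates the shared counter."""
    for _ in range(increments):
        with lock:
            counter["value"] += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])  # total of all increments made by all threads
```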

tile

A logical grouping of one of each kind of VM used within SPECvirt. For SPECvirt_sc2010, a tile consists of one Web Server, Mail Server, Application Server, Database Server, Infrastructure Server, and Idle Server. A valid SPECvirt_sc2010 benchmark result is achieved by correctly executing the benchmark workloads on one or more tiles.

vectorizable

The property of a computer program, or program segment, that allows for the simultaneous execution of operations on different data values, thus making it possible to allocate the work to a set of operators and accomplish the work in parallel. One example of work that is very vectorizable is taking an entire matrix of values and multiplying each by 2: it is possible for different operators to work on different cells of the matrix at the same time. One example of work that is not vectorizable is adding to each item in an array the value of the preceding item in the array: each calculation is dependent upon the results of the preceding calculation, so there is no way to perform the operations at the same time.
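The two examples can be sketched in plain Python. The code itself runs sequentially; the point is only to show which loop has independent iterations and which carries a dependency. The data values are illustrative.

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# Vectorizable: each cell is doubled independently of every other cell,
# so the cells could be processed in any order or all at once.
doubled = [[2 * cell for cell in row] for row in matrix]

# Not vectorizable as written: each item needs the already-updated value
# of the preceding item, so the iterations must run in order.
running = [1, 2, 3, 4]
for i in range(1, len(running)):
    running[i] += running[i - 1]

print(doubled)
print(running)
```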

Vectorization is only one subclass (probably one of the most restrictive subclasses) of parallelizable programming.

virtual machine (VM)

An abstracted execution environment which presents an operating system (either discrete or virtualized). The abstracted execution environment presents the appearance of a dedicated computer with CPU, memory and I/O resources available to the operating system. In SPECvirt, a VM consists of a single OS and the application software stack that supports a single SPECvirt_sc2010 component workload. There are several methods of implementing a VM, including physical partitions, logical partitions and software-managed virtual machines.

warm up

A period of time prior to when the actual measurement is taken, where the workload has been already started in an effort to get the SUT to a stable and consistent state.

workload

The workload is the definition of the units of work that are to be performed during a benchmark run.