The SPEC Cloud® IaaS 2016 Benchmark is an industry-standard benchmark for measuring the performance of infrastructure-as-a-service (IaaS) cloud implementations. It comprises a benchmark testing tool, workloads, run rules, configuration requirements, test procedures, data collection, validation, metric definitions, reporting requirements, and peer review, all determined by multi-vendor consensus to give representative, comparable, vendor-neutral, accurate, and reproducible results. It is designed to stress and measure both the provisioning and runtime aspects of an IaaS cloud.
The benchmark suite is targeted for use by cloud providers, cloud consumers, hardware vendors, virtualization software vendors, application software vendors, and academic researchers.
The primary goal is to provide metrics that not only quantify the relative performance and capacity of a cloud, but also characterize how typical cloud application workloads behave as the underlying cloud resources are stretched and approach full capacity. The primary goals and objectives of the benchmark are:
It does not cover Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS) performance measurements.
The following terms are commonly used in this FAQ and are defined on the glossary page:
The SUT includes one or more cloud services under test. This includes all hardware, network, base software and management systems used for the cloud service. It does not include any client(s) or driver(s) necessary to generate the cloud workload, nor the network connections between the driver(s) and SUT. The actual set of SUT constituent pieces differs based on the relationship between the SUT and the tester.
SPEC Cloud IaaS 2016 is designed to measure the performance of an IaaS cloud running real workloads.
Yes.
No, not at this time.
The benchmark measures scalability, elasticity, and provisioning time. Of these, elasticity and provisioning time are directly comparable across clouds.
This benchmark has been tested on the following cloud platforms:
In theory, a cloud is infinitely scalable, and the benchmark is designed so that it can scale along with the cloud: if no issues occur as the scale grows, the benchmark can be made to run indefinitely.
However, in reality, a cloud is partitioned into data centers and zones, so a cloud provider, public or private, does have an upper limit on hardware and network resources.
If a third party runs SPEC Cloud IaaS 2016 on a public cloud, they are limited by the money they can spend on the cloud, which (typically) limits the number of instances that can be created. In SPEC Cloud IaaS 2016, the maximum number of application instances to create can be specified, which helps a tester manage the money spent while benchmarking the cloud.
If a public cloud provider tests its own cloud, it is obviously not limited by how much it has to spend.
SPEC Cloud IaaS 2016 stresses the control and data planes of a cloud by creating multiple instances (a minimum of 26 instances across 4 application instances). The benchmark cannot be used to measure the performance of a single instance.
However, when hundreds or thousands of instances exist in your cloud and are running customer workloads, will the performance of a single instance remain the same?
A cloud, by definition, is elastic and scalable (across CPU, memory, disk, network, control plane, data plane, etc.). The SPEC Cloud benchmark is designed to measure the scalability and elasticity of an IaaS cloud by running workloads that span multiple instances (application instances, or AIs) and that stress the CPU, memory, disk, and network of both the instances and the cloud.
The benchmark reports three primary metrics, namely Scalability, Elasticity, and Mean Instance Provisioning Time. The primary metrics are described below.
Scalability measures the total amount of work performed by application instances running in a cloud relative to a reference platform.
The total work performed by the benchmark is an aggregate of key workload metrics across all application instances running in a cloud, normalized by the corresponding workload metrics of a reference cloud platform. The reference platform metrics are an average of workload metrics measured across multiple cloud platforms during benchmark development. The application instances are launched according to a probability distribution and gradually increase the load on the cloud. Each application instance runs either the YCSB/Cassandra or the KMeans/Hadoop workload.
Elasticity measures whether the work performed by application instances scales linearly in a cloud. That is, for statistically similar work, the performance of N application instances in a cloud should match the performance of an application instance during the baseline phase, when no other load is introduced by the tester. Elasticity is expressed as a percentage (out of 100); the higher, the better. Elasticity is self-referential, i.e., it is measured against the cloud's own baseline rather than against a reference platform.
Mean Instance Provisioning Time measures the average time from the initial provisioning request for an instance until that instance is ready to accept SSH connections.
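To make the relationship between the three metrics concrete, the following is a minimal, illustrative Python sketch. It is not the normative computation: the exact formulas, the workload metrics they aggregate, and the run-validation rules are defined in the Run and Reporting Rules and Design documents. The per-AI performance values, the reference value, and the baseline value used below are assumptions chosen purely for illustration.

    # Illustrative sketch only -- NOT the exact SPEC Cloud IaaS 2016 formulas.
    from statistics import mean

    def scalability(ai_performance, reference_performance):
        """Aggregate work across all AIs, normalized by a reference platform value."""
        return sum(p / reference_performance for p in ai_performance)

    def elasticity(ai_performance, baseline_performance):
        """How close per-AI performance stays to this cloud's own baseline (percentage)."""
        ratios = [min(p / baseline_performance, 1.0) for p in ai_performance]
        return 100.0 * mean(ratios)

    def mean_provisioning_time(request_times, ssh_ready_times):
        """Average time from provisioning request until SSH is accepted."""
        return mean(ready - req for req, ready in zip(request_times, ssh_ready_times))

    # Example: 5 AIs whose throughput degrades slightly as the cloud fills up.
    perf = [105.0, 102.0, 98.0, 95.0, 90.0]
    print(scalability(perf, reference_performance=10.0))        # 49.0 "units of work"
    print(elasticity(perf, baseline_performance=105.0))         # ~93.3%
    print(mean_provisioning_time([0, 10, 20], [80, 95, 115]))   # ~86.7 seconds

Note how, in this sketch, elasticity is measured against the cloud's own baseline, whereas scalability is normalized against a reference platform value.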
a) Vendor A had a score of 10.5 @ 5 AIs while vendor B had a score of 9 @ 6 AIs, and both had the same elasticity of 90%. Both had a mean instance provisioning time of 100s.
Clearly, vendor A achieved a higher scalability score and can publicize it as such.
b) Vendor A had a score of 10.5 @ 5 AIs while vendor B had a score of 9 @ 6 AIs. Vendor A's elasticity was 85% while vendor B's elasticity was 90%. Both had a mean instance provisioning time of 100s.
Vendor B was able to do more work (6 AIs vs. 5 AIs) while maintaining better elasticity (more consistent performance). So while vendor A may publicize a higher scalability score, vendor B can claim more consistent performance due to its higher elasticity score.
c) Vendor A had a score of 10.5 @ 5 AIs, vendor B had a score of 9 @ 6 AIs, and vendor C had a score of 6 @ 4 AIs. Vendor A's elasticity was 85%, vendor B's was 90%, and vendor C's was 80%. Vendors A and B had a mean instance provisioning time of 100s, while vendor C had a mean instance provisioning time of 50s.
Vendor C had lower scalability and elasticity scores than vendors A and B, but its mean instance provisioning time was better than both. A customer to whom provisioning speed is important, but who does not care about running large workloads, may be inclined to dig deeper into vendor C's offering.
d) Vendor A has a scalability score of 10 @ 5 AIs and vendor B has a score of 30 @ 20 AIs.
Vendor A can say that its application instances do more work per instance than vendor B's (10/5 = 2 vs. 30/20 = 1.5), although vendor B achieves a higher overall scalability score. On the other hand, vendor B can say it achieves the larger scale. A customer to whom high scale is important may be interested in knowing more about vendor B's offering.
Either vendor A or B can be a whitebox or a blackbox cloud.
SPEC has identified multiple workload classifications already used in current cloud computing services. From this list, SPEC selected I/O- and CPU-intensive workloads for the initial benchmark. Within the wide range of I/O- and CPU-intensive workloads, SPEC selected a social-media NoSQL database transaction workload (YCSB/Cassandra) and a K-Means clustering workload using Hadoop (Hibench/KMeans).
Each workload runs across multiple instances, collectively referred to as an application instance. The benchmark instantiates a single application instance during the baseline phase and creates multiple application instances during the elasticity + scalability phase according to a uniform probability distribution. These application instances and the load they generate stress the provisioning as well as the run-time aspects of a cloud. The run-time aspects include the disk and network I/O, CPU, and memory of the instances running in a cloud.
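As a loose illustration of this launch pattern, here is a minimal sketch of an elasticity-phase driver loop. It is not the benchmark's actual driver; the inter-arrival bounds, the AI cap, and the alternation between the two workloads are placeholder assumptions.

    # Hypothetical launch loop -- placeholder values, not the real elasticity driver.
    import random
    import time

    MIN_DELAY_S = 60    # assumed lower bound on the inter-AI arrival delay
    MAX_DELAY_S = 180   # assumed upper bound
    MAX_AIS = 10        # tester-specified cap on application instances

    def launch_ai(workload):
        # In the real harness, this asks the cloud to provision a full AI
        # (all of its instances) and then starts the workload on it.
        print(f"provisioning one {workload} application instance")

    workloads = ["ycsb_cassandra", "kmeans_hadoop"]
    for i in range(MAX_AIS):
        launch_ai(workloads[i % len(workloads)])                # alternate the two workloads
        time.sleep(random.uniform(MIN_DELAY_S, MAX_DELAY_S))    # uniformly distributed gap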
Direct comparison with the open-source versions of these workloads is not recommended, as the SPEC Cloud IaaS 2016 Benchmark runs them in particular configurations.
CBTOOL supports over 20 workloads and allows easy addition of new ones. The supported workloads are available as part of the SPEC kit and in the CBTOOL source code online (https://github.com/ibmcb/cbtool/tree/master/scripts).
Yes. You can use one of the supported workloads in CBTOOL or add your own workload to run under the SPEC Cloud IaaS 2016 Benchmark harness. However, for a compliant run, the submitted results must be produced using the workloads (YCSB/Cassandra and KMeans/Hadoop) and their configurations as specified in the Run Rules document.
SPEC Cloud IaaS 2016 is available via web download from the SPEC site at $2000 for new licensees and $500 for academic and eligible non-profit organizations. The order form is at: http://www.spec.org/order.html.
The SPEC Cloud IaaS 2016 Benchmark comprises the test harness and the workloads. This release includes the benchmark harness (Cloudbench (CBTOOL), the baseline and elasticity drivers, and relevant configuration files) along with the YCSB and Hibench/KMeans workloads, Cassandra and Hadoop source code and operating-system packages, as well as scripts to produce benchmark reports and submission files.
The benchmark kit includes example scripts to facilitate testing and data collection.
The kit also includes the relevant documentation, namely the User Guide, the Run and Reporting Rules document, and the Design document. The documents may be updated from time to time; the latest copies are available on the SPEC website.
If your issue is not resolved by these documents, please send email to cloudiaas2016support@spec.org. For CBTOOL-specific questions, please post to the cbtool-users group: https://groups.google.com/forum/#!forum/cbtool-users
Initial SPEC Cloud IaaS 2016 results are available on SPEC's web site. Subsequent results are posted on an ongoing basis following each two-week review cycle: results submitted by the two-week deadline are reviewed by SPEC Cloud committee members for conformance to the run rules and, if accepted at the end of that period, are then publicly released. Results disclosures are at: http://www.spec.org/cloud_iaas2016/results.
Yes. Please follow the rules for open-source applications specified in Section 3.3.3 of the Run and Reporting Rules document.
No. There is no designated independent auditor, but the results must undergo peer review before they are published by SPEC.
No.
No.
A compliant run is a test that follows the SPEC Cloud IaaS 2016 Benchmark run and reporting rules.
For testing, yes. For instance, YCSB can be run with 10 million operations and 10 million records. But for a compliant submission, the workloads must be run with the parameters described in the Run Rules document.
The approximate duration for running the benchmark is as follows:
NOTE: This assumes your cloud can provision all VMs in parallel. Not all clouds can, and if yours cannot, the two phases can take longer.
It takes approximately 2-3 hours to prepare the benchmark images, assuming you follow the instructions in the user guide.
If the cloud under test is not supported by the benchmark, then an adapter must be developed. The instructions for adding an adapter are in the user guide. The adapter must be reviewed by the subcommittee prior to a submission.
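As a purely conceptual illustration of what an adapter must cover, here is a hypothetical skeleton. The class and method names below are invented for this sketch and are not CBTOOL's actual adapter interface; consult the user guide and the existing adapters in the CBTOOL source for the real module layout and method signatures.

    # Hypothetical skeleton -- invented names, not CBTOOL's real adapter API.
    class MyCloudAdapter:
        """Conceptually maps the harness's instance lifecycle onto a cloud's native API."""

        def __init__(self, endpoint, credentials):
            self.endpoint = endpoint        # cloud API endpoint (assumption)
            self.credentials = credentials  # authentication material (assumption)

        def create_instance(self, image, flavor, network):
            """Request a new instance and return its identifier."""
            raise NotImplementedError("call the cloud's create API here")

        def is_ssh_ready(self, instance_id):
            """Report whether the instance accepts SSH (feeds the provisioning-time metric)."""
            raise NotImplementedError("poll the instance here")

        def destroy_instance(self, instance_id):
            """Tear the instance down at the end of a run."""
            raise NotImplementedError("call the cloud's delete API here")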
No.
No.
Yes, only if your cloud is generally available to users other than yourself.
Only if those customizations or prioritizations are available to other users and not only to yourself.
Yes, only if you have full administrative access to, and visibility into, the hardware, and can configure the benchmark to provide the appropriate supporting evidence as described by the SPEC documentation, ensuring to SPEC that no other instances or workloads are actively provisioned in your cloud. If you do not have this level of control (defined as "complete" control in the documentation), then you must redefine your cloud as a private, black-box cloud, remove those prioritizations or customizations, and make the cloud generally available to other users. If you cannot do either of those things, then your submission is not valid. If you submit anyway without disclosing those customizations or prioritizations, and your submission is later challenged by another submission in such a way that it cannot be reproduced, then SPEC has the right to retroactively invalidate your submission.