Benchmark Overview

1) What is SPEC Cloud IaaS 2018 benchmark?

The SPEC Cloud® IaaS 2018 benchmark measures the performance of infrastructure-as-a-service (IaaS) cloud implementations. It supports testing of both public and private clouds.

The benchmark stresses provisioning as well as runtime aspects of a cloud using I/O- and CPU-intensive cloud computing workloads. SPEC selected a social media NoSQL database transaction workload and K-Means clustering using map/reduce as two significant and representative workload types within cloud computing. The first workload uses the Yahoo! Cloud Serving Benchmark (YCSB) with the NoSQL Cassandra database. The second workload uses KMeans from the Intel HiBench suite for Hadoop.

Each workload runs as a distributed application made up of 6 or 7 instances, referred to as an application instance (AI). The benchmark provisions multiple application instances during a run. Adding AIs stresses the cloud's available resources (e.g., CPU, memory, disk, and network).

The workloads continue to run until the test fails the quality of service (QoS) requirements. The tester can also limit the maximum number of application instances created during a run.

For additional details on the benchmark design, how to set up and run the test, and how to create compliant results for publication, please review the Design Overview, User Guide, and Run and Reporting Rules in addition to this FAQ.

2) Who is the intended audience for the benchmark?

The benchmark suite is targeted for use by cloud providers, cloud consumers, hardware vendors, virtualization software vendors, application software vendors, and academic researchers.

3) What are the primary objectives and goals of the SPEC Cloud IaaS 2018 Benchmark?

The primary goal is to provide metrics that quantify not only the relative performance and capacities of a cloud but also how typical cloud application workloads behave as the underlying cloud resources are stretched and approach full capacity. The primary goals and objectives of the benchmark are:

  • Stress the provisioning, compute, storage, and network of a cloud with multiple multi-instance workloads, subject to strict Quality of Service (QoS) requirements.
  • Place limited requirements on the internal architecture of the cloud (see Section 2.1 Benchmark Environment in Run and Reporting Rules).
  • Place no requirements on instance configuration. A cloud provider may use a physical machine, a virtual machine, or a container as an instance type. A cloud provider is free to choose the CPU (virtual CPU or core pinning), memory, disk (ephemeral disk or block storage), and network configuration for an instance type.
  • Do not require a hypervisor or virtualization layer.
  • Use workloads that resemble those typically run in a cloud, such as social media applications and big data analytics.
  • Support multi-tenancy.

4) Compared to SPEC Cloud IaaS 2016, what’s new in SPEC Cloud IaaS 2018?

SPEC Cloud IaaS 2018 builds on the 2016 release with a variety of enhancements and new primary metrics. Below is a list of the significant changes; for more detail, see the SPEC Cloud IaaS 2018 Design Overview and User Guide. Please note that while the underlying workloads (KMeans and YCSB) were retained, SPEC Cloud IaaS 2018 results are not comparable to SPEC Cloud IaaS 2016 results due to changes in the baseline methodology and metric calculations.

  1. Benchmark, Workloads, and Metrics
    • New versions of Hadoop and Cassandra for KMeans and YCSB respectively.
    • Updated YCSB workload parameters to increase I/O load for current storage technologies.
    • Revised methodology for the baseline phase.
    • New naming and calculations for primary metrics.
    • New reference scores.
    • Revised full disclosure report.
  2. CBTOOL
    • Improvements to make it simpler to install CBTOOL and set up the initial workload images.
    • Improved support for public clouds (Amazon Elastic Compute Cloud, Digital Ocean, IBM Cloud (SoftLayer), Google Compute Engine, Azure Service Management).
    • Improved support for more recent versions of OpenStack.
    • Improved support for container and container managers (e.g., Docker/Swarm, LXD, Kubernetes).
  3. Rules and Documentation
    • Updated all documentation for SPEC Cloud IaaS 2018.
    • SPEC Cloud IaaS 2018 metrics not comparable to 2016.
  4. Other changes
    • SPEC Cloud IaaS 2016 is being retired. The final submission date is Dec. 26, 2018.

5) What does SPEC Cloud IaaS 2018 Benchmark measure?

The benchmark reports four primary metrics:

  • Replicated Application Instances reports the total number of valid AIs that have completed at least one application iteration at the point the test ends. The reported total is the sum of the valid AIs for each workload (KMeans and YCSB), where the valid AIs of either workload cannot exceed 60% of the total. The other primary metrics are calculated from the conditions at the point this number of valid AIs is achieved.
  • Performance Score aggregates the workload scores for all valid AIs to represent the total work done at the reported number of Replicated Application Instances. It is the sum of the KMeans and YCSB workload performance scores, normalized using the reference platform. The reference platform values are a composite of baseline metrics from several different white-box and black-box clouds. Because the Performance Score is normalized, it is a unitless metric. (A simplified, illustrative calculation sketch for the Performance Score and Relative Scalability follows this list.)
  • Relative Scalability measures whether the work performed by application instances scales linearly in a cloud. In a perfect cloud, when multiple AIs run concurrently, each AI offers nearly the same level of performance as that measured for an AI running similar work during the baseline phase, when the tester introduces no other load. Relative Scalability is expressed as a percentage (out of 100).
  • Mean Instance Provisioning Time averages the provisioning time for instances from all valid application instances. Each instance provisioning time measurement is the time from the initial instance provisioning request to connectivity on port 22 (ssh).
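
The following Python sketch illustrates, in simplified form, how a normalized performance score and a relative-scalability percentage could be derived from per-AI workload scores. The scores, variable names, and reference values below are illustrative assumptions only; the official metric definitions and reference values are specified in the Design Overview.

```python
# Illustrative sketch only -- a simplified stand-in for the official metric
# calculations defined in the SPEC Cloud IaaS 2018 Design Overview.

# Hypothetical per-AI workload scores collected at the end of the scale-out phase.
ycsb_scores = [410.0, 395.0, 402.0]    # e.g., throughput-based scores per valid YCSB AI
kmeans_scores = [88.0, 91.0, 90.0]     # e.g., completion-time-based scores per valid KMeans AI

# Hypothetical reference-platform values (the real values are fixed by SPEC).
REF_YCSB = 400.0
REF_KMEANS = 90.0

# A normalized, unitless performance score: per-workload sums divided by the
# reference values, then added together.
performance_score = sum(ycsb_scores) / REF_YCSB + sum(kmeans_scores) / REF_KMEANS

# Relative scalability compares average per-AI performance during scale-out
# with the single-AI performance measured in the baseline phase.
baseline_ycsb = 405.0                  # hypothetical baseline score for one YCSB AI
avg_ycsb = sum(ycsb_scores) / len(ycsb_scores)
relative_scalability_pct = 100.0 * avg_ycsb / baseline_ycsb

print(f"Performance Score (unitless, illustrative): {performance_score:.2f}")
print(f"Relative Scalability (YCSB only, illustrative): {relative_scalability_pct:.1f}%")
```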

6) Where can I find the glossary of terms used in the benchmark?

The SPEC Cloud IaaS 2018 Glossary compiles the names and terms used in the benchmark and its documentation. Please see:

Glossary page

7) What is the system under test (SUT) in SPEC Cloud IaaS 2018 benchmark?

The SUT is the cloud environment being tested. This includes all hardware, network, base software, and management systems used for the cloud service. It does not include any client(s) or driver(s) necessary to generate the cloud workload, nor the network connections between the driver(s) and SUT. The actual set of SUT constituent pieces differs based on whether it is a white-box or black-box cloud.

8) Why should one use SPEC Cloud IaaS 2018 Benchmark as opposed to other benchmarks used to evaluate cloud performance?

SPEC Cloud IaaS 2018 is designed to measure the performance of an IaaS cloud running real workloads. Developed by members of the SPEC consortium of companies that have one or more IaaS cloud offerings, the benchmark brings the standards that SPEC benchmarks are known for to cloud environments. The SPEC Cloud IaaS 2018 benchmark advantages include:

  • Uses real distributed applications: Hadoop running KMeans and Cassandra running YCSB social media workload.
  • Measures performance characteristics of a scale-out workload that adds load by adding application instances, modeling a typical workload pattern for many IaaS clouds.
  • Reports primary metrics that cover the overall performance of the cloud at peak load, relative scalability, and provisioning performance.
  • Reports QoS stats and detailed breakdowns on throughputs, completion times, and latencies based on the workload type.
  • Is executed according to the run rules specified in the Run and Reporting Rules document.
  • Results are peer reviewed before they are published.
  • Requires adherence to strict guidelines for reproducibility of results.

9) What are the limitations of SPEC Cloud IaaS 2018 Benchmark?

  1. SPEC Cloud IaaS 2018 Benchmark is a benchmark for infrastructure-as-a-service clouds. It does not measure the performance of platform-as-a-service (PaaS) clouds or software-as-a-service (SaaS) clouds.
  2. The benchmark does not support scaling up AIs by adding new instances (e.g. seeds, data nodes) to existing AIs or by adding CPU, Memory, or Storage resources to individual instances within AIs. The number of instances per AI type for a compliant test is defined by the benchmark. The resources assigned to the instance images used by the AI are set by the tester prior to starting the test.
  3. The benchmark does not specifically interpret CPU, memory, network or storage performance of an instance (although it is recorded by our tooling). The performance of these components is indirectly measured through YCSB and KMeans workloads that utilize Apache Cassandra and Apache Hadoop, respectively. A cloud provider is free to choose instance configuration.
  4. Client-server workloads (REST and HTTP) are a class of workloads that run in cloud environments. For these types of workloads, the clients typically run outside the cloud. Running such external workload generators was outside the scope of this benchmark release.

10) Can I use SPEC Cloud IaaS 2018 Benchmark to determine the size of the cloud I need?

Yes, if you understand the characteristics of the workloads you plan to use in your cloud, you may be able to relate them to the benchmark’s workloads and metrics.

11) Does SPEC Cloud IaaS 2018 Benchmark have a power measurement component associated with the benchmark?

No, not at this time.

12) Can I compare SPEC Cloud IaaS 2018 Benchmark results for private (white-box) and public (black-box) clouds?

Yes, the benchmark’s metrics are considered comparable between private (white-box) and public (black-box) clouds.

13) Which cloud platforms are supported by SPEC Cloud IaaS 2018 Benchmark?

The following cloud platforms have CBTOOL adapters that work with this benchmark and have been tested by members of the SPEC Cloud subcommittee. Additional cloud platforms can be supported once a CBTOOL adapter has been written and submitted to the subcommittee’s review process.

  • Amazon EC2 (public)
  • Digital Ocean (public)
  • Google Cloud (public)
  • IBM SoftLayer
  • Intel Lab OpenStack Environment (private)
  • OpenStack (distro juno/kilo) (private)
  • Private Cloud offerings by Dell, IBM (x86_64, ppc)
  • VMware Integrated OpenStack (VIO)

14) What skills do I need to run SPEC Cloud IaaS 2018 Benchmark?

  • Basic Linux administration skills.
  • Cloud administration skills for installing and configuring a white-box cloud.
  • Cloud usage skills for a black-box cloud.
  • Ability to install software and debug problems in distributed systems.
  • Familiarity with Python.
  • Nice to have: familiarity with Hadoop or Cassandra.

16) A public cloud may have thousands of machines. Won’t a tester paying for usage of these resources incur a huge bill to run the benchmark?

In theory, a perfect cloud is infinitely scalable, and the benchmark is designed so that it could measure such infinite scale. If QoS limits were never exceeded during scale-out, the benchmark would continue to run indefinitely and the tester would be billed for an ever-increasing amount of cloud resources.

In reality, the cloud gets partitioned into data centers and zones. So a cloud provider, public or private, does have upper limits on hardware and network resources.

A tester with a limited budget for cloud resources who runs SPEC Cloud IaaS 2018 on a public cloud needs to limit the number of instances created to stay within that budget.

The tester can set a maximum for the number of application instances created. This parameter controls the total number of instances needed (roughly 6.5 * #AIs) and allows the tester to estimate the duration of the test and the cost of running the benchmark in the cloud.
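
As a rough, hypothetical illustration (the AI cap, instance-hour price, and run duration below are made-up values, and all instances are treated as running for the full duration, which overestimates the cost since AIs are provisioned incrementally):

```python
# Rough back-of-the-envelope estimate; the AI limit, instance-hour price, and
# run duration below are hypothetical examples, not benchmark requirements.
max_ais = 20                # tester-defined cap on application instances
instances_per_ai = 6.5      # approximate average (each AI uses 6 or 7 instances)
hourly_rate = 0.10          # hypothetical price per instance-hour (USD)
run_hours = 4               # assumed total run time, including the baseline phase

total_instances = int(round(max_ais * instances_per_ai))
estimated_cost = total_instances * hourly_rate * run_hours

print(f"~{total_instances} instances, estimated cost ~${estimated_cost:.2f}")
```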

We recommend that when testing in a public cloud, the tester begin with the maximum set to a few AIs, so the tester can gain experience with benchmark runs that complete within several hours before embarking on tests that could run for days, create hundreds of instances, and generate significant cloud usage charges.

Relationship to Other SPEC Benchmarks

17) I am interested in the CPU performance of a single instance. Shall I use SPEC Cloud IaaS 2018?

No. SPEC Cloud IaaS 2018 stresses the control and data plane of a cloud by creating multiple instances (a minimum of 26 instances making up 4 application instances). The benchmark cannot be used to measure the performance of a single instance.

When hundreds or thousands of instances exist in your cloud and are running customer workloads, the performance of any single instance or workload in the cloud may vary as more resources are consumed.

Cloud, by definition, is scalable across CPU, memory, disk, network, control plane, and data plane. The SPEC Cloud benchmark measures the performance and scalability of an IaaS cloud by running workloads that span multiple instances (application instances - AI), and that stress CPU, memory, disk, and networking resources in the cloud.

20) How can SPEC Cloud scores of vendors be compared? Can you give examples?

Example 1

Metrics                     Vendor A    Vendor B    Vendor C    Vendor D
Replicated AIs              5           6           -           -
Performance Score           10.5        9.0         -           -
Relative Scalability        90%         90%         -           -
Mean Instance Prov. Time    100 secs    100 secs    -           -

Vendor A achieved a higher Performance Score and can publicize that advantage over Vendor B.

Example 2

Metrics                     Vendor A    Vendor B    Vendor C    Vendor D
Replicated AIs              5           6           -           -
Performance Score           10.5        9.0         -           -
Relative Scalability        80%         90%         -           -
Mean Instance Prov. Time    100 secs    100 secs    -           -

Vendor B was able to do more work (6 AIs vs. 5 AIs) while maintaining better relative scalability (consistent performance). So while Vendor A may publicize a higher Performance Score, Vendor B can claim consistency in performance due to higher Relative Scalability.

Example 3

Metrics                     Vendor A    Vendor B    Vendor C    Vendor D
Replicated AIs              5           6           4           -
Performance Score           10.5        9.0         6.0         -
Relative Scalability        80%         90%         75%         -
Mean Instance Prov. Time    100 secs    100 secs    50 secs     -

Vendor C had a lower Performance Score and Relative Scalability than the other two, but its mean instance provisioning time was better than either A or B. If provisioning speed is essential to a customer and they do not require the larger workloads, they may be inclined to dig deeper into Vendor C’s offering.

Example 4

Metrics                     Vendor A    Vendor B    Vendor C    Vendor D
Replicated AIs              5           6           4           20
Performance Score           10.5        9.0         6.0         30.0
Relative Scalability        80%         90%         75%         90%
Mean Instance Prov. Time    100 secs    100 secs    50 secs     100 secs

Vendor A can say that its application instances do more work than Vendor D’s based on the ratio of the Performance Score to the number of AIs (10.5/5 vs. 30/20; see the short calculation below), although Vendor D achieves a higher Relative Scalability score. On the other hand, Vendor D can say it achieves the best scale. A customer for whom high scale is essential may be interested in knowing more about Vendor D’s offering.
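
The per-AI comparison above is simply each vendor's Performance Score divided by its number of Replicated AIs; a minimal sketch of the arithmetic using the Example 4 values:

```python
# Per-AI work comparison, using the values from the Example 4 table above.
vendor_a = {"performance_score": 10.5, "replicated_ais": 5}
vendor_d = {"performance_score": 30.0, "replicated_ais": 20}

per_ai_a = vendor_a["performance_score"] / vendor_a["replicated_ais"]   # 2.1
per_ai_d = vendor_d["performance_score"] / vendor_d["replicated_ais"]   # 1.5

print(f"Vendor A per-AI score: {per_ai_a:.2f}")   # more work per AI
print(f"Vendor D per-AI score: {per_ai_d:.2f}")   # less work per AI, but higher total scale
```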

Workloads Used

21) Which workloads are used in the SPEC Cloud IaaS 2018 Benchmark?

SPEC has identified multiple workload classifications already used in current cloud computing services. From this list, SPEC has selected I/O and CPU intensive workloads for the initial benchmark. Within the wide range of I/O and CPU intensive workloads, SPEC selected a social media, NoSQL database transaction workload (YCSB/Cassandra) and K-Means clustering workload using Hadoop (HiBench/KMeans).

22) How are these workloads used in the SPEC Cloud IaaS 2018 Benchmark?

Each workload runs across multiple instances that together are referred to as an application instance (AI). The benchmark instantiates a single application instance during the baseline phase and creates multiple application instances during the scale-out phase, spacing their creation according to a uniform probability distribution (a simplified pacing sketch follows this paragraph). These application instances and the load they generate stress the provisioning as well as the run-time aspects of a cloud. The run-time aspects include disk and network I/O, CPU, and memory of the instances running in a cloud.
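
As a simplified illustration of the scale-out pacing (the interval bounds and AI cap below are hypothetical placeholders, not the benchmark's configured values), the harness can be thought of as requesting a new AI after a delay drawn uniformly from a configured range:

```python
# Simplified illustration of scale-out pacing; the interval bounds and AI cap
# are hypothetical placeholders, not the benchmark's configured values.
import random
import time

MIN_INTERVAL_S = 300   # hypothetical lower bound between AI provisioning requests
MAX_INTERVAL_S = 600   # hypothetical upper bound
MAX_AIS = 10           # tester-defined maximum number of AIs

def provision_ai(index: int) -> None:
    # Stand-in for asking the cloud (via the harness) to create one application instance.
    print(f"provisioning application instance {index}")

for i in range(1, MAX_AIS + 1):
    provision_ai(i)
    if i < MAX_AIS:
        # Wait a uniformly distributed interval before requesting the next AI.
        time.sleep(random.uniform(MIN_INTERVAL_S, MAX_INTERVAL_S))
```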

Note that the benchmark’s workload results are not comparable to results obtained with the original open-source workloads because of differences in configuration.

23) Does CBTOOL, the harness for SPEC Cloud IaaS 2018 Benchmark support other workloads?

CBTOOL supports over 20 workloads; the SPEC kit includes the version of CBTOOL available at the time the benchmark was released. To obtain the most current version of the CBTOOL source code and the list of supported workloads, see https://github.com/ibmcb/cbtool/tree/master/scripts. CBTOOL also supports easy addition of new workloads.

24) Can I use other workloads for testing my cloud?

Yes. You can use the CBTOOL that’s included in the kit to run any of the workloads CBTOOL supports or use CBTOOL to run your own workloads. If you want to use other workloads, you may want to use the CBTOOL source from GitHub, so you’ll have the latest set of CBTOOL supported workloads (see link above). For SPEC compliant runs and submissions, you must use the workloads YCSB/Cassandra and KMeans/Hadoop and their configurations as specified in the Run Rules document.

Benchmark Kit

25) How can I obtain the SPEC Cloud IaaS 2018 benchmark kit?

SPEC Cloud IaaS 2018 is available via web download from the SPEC site at $2000 for new licensees and $500 for academic and eligible non-profit organizations. The order form is at: http://www.spec.org/order.html.

26) What is included with the SPEC Cloud IaaS 2018 Benchmark kit?

SPEC Cloud IaaS 2018 Benchmark comprises the test harness and workloads for the benchmark. This release includes the benchmark harness (CBTOOL, baseline and scale-out drivers, and relevant configuration files) along with the YCSB and HiBench/KMeans workloads, Cassandra and Hadoop source code and operating system packages, as well as scripts to produce benchmark reports and submission files.

The benchmark kit includes example scripts to facilitate testing and data collection.

The kit also includes the relevant documentation, that is, the User Guide, the Run and Reporting Rules, and the Design Overview. The documents may be updated from time to time; the latest copies are available on the SPEC website.

27) Where can I find a user guide to run the benchmark?

User Guide

28) What if I followed the user guide and have questions running SPEC Cloud IaaS 2018 Benchmark?

If your issue is not resolved in these documents, please send email to cloudiaas2018support@spec.org. For CBTOOL-specific questions, please post to the cbtool-users group: https://groups.google.com/forum/#!forum/cbtool-users

29) Where are the run and reporting rules for SPEC Cloud IaaS 2018 Benchmark?

The SPEC Cloud IaaS 2018 Benchmark run rules are available online and also shipped as part of the benchmark kit. The run rules may be updated from time to time. The latest version is available on the SPEC website.

Testing, Compliant Run, Results Submission, Review and Result Announcement

30) How can I submit SPEC Cloud IaaS 2018 Benchmark results?

  • Create a zip file containing the result submission .txt file (under the perf directory) and the cloud architecture diagram in PNG format.
  • Email that .zip file to subcloudiaas2018@spec.org.
  • To upload supporting documents, wait until you receive confirmation that your result was received. Create an archive package (e.g., .tgz, .zip) with the documents and give it the name of your associated result as provided in your confirmation, e.g., cloudiaas2018-20181107-00016.tgz (a packaging sketch follows this list).
  • Upload the file using the FTP information found here: https://pro.spec.org/private/osg/cloud/ftpuser.txt
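
A minimal sketch of the packaging steps, assuming hypothetical file and directory names for the submission file, diagram, and supporting documents (the supporting-documents archive name comes from your confirmation email; the name below is the example from this FAQ):

```python
# Minimal packaging sketch; the file and directory names below are hypothetical examples.
import tarfile
import zipfile

# 1. Zip the submission .txt (from the perf directory) and the architecture diagram.
with zipfile.ZipFile("cloudiaas2018-submission.zip", "w") as zf:
    zf.write("perf/sub_file.txt")          # hypothetical submission file name
    zf.write("architecture_diagram.png")   # cloud architecture diagram in PNG format

# 2. After receiving confirmation, package supporting documents under the
#    result name given in the confirmation email.
with tarfile.open("cloudiaas2018-20181107-00016.tgz", "w:gz") as tf:
    tf.add("supporting_docs/")             # hypothetical directory of supporting documents
```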

31) When and where are SPEC Cloud IaaS 2018 Benchmark results available?

Initial SPEC Cloud IaaS 2018 results are available on SPEC’s web site. Subsequent results are posted on an ongoing basis following each two-week review cycle: results submitted by the two-week deadline are reviewed by SPEC Cloud subcommittee members for conformance to the run rules, and if accepted at the end of that period are then publicly released. Results disclosures are at: http://www.spec.org/cloud_iaas2018/results.

32) Can I report results for newer versions of Apache Hadoop and Cassandra?

Yes. Please follow the rules for open source applications specified in Section 3.3.3 of Run and Reporting Rules document.

33) Are the results independently audited?

No. There is no designated independent auditor, but the results have to undergo a peer review before they are published by SPEC.

34) Can I announce my results before they are reviewed by the SPEC Cloud subcommittee?

No.

35) Are results sensitive to components outside of the SUT, e.g., client driver machines?

No.

36) What is a “compliant” result of SPEC Cloud IaaS 2018 Benchmark?

A compliant run is a test that follows the SPEC Cloud IaaS 2018 benchmark run and reporting rules.

37) Can I run other workload levels?

For experimental testing, yes. For instance, YCSB can be run with 10 million operations and 10 million records. But for a compliant submission, the workloads must be run with the parameters described in the Run and Reporting Rules document.

38) How long does it take to run the benchmark?

The approximate duration for running the benchmark is as follows.

  • Baseline phase for both KMeans and YCSB may take 2 or more hours.
  • Scale-out phase depends on the number of AIs, for example, 15 AIs may take 2 hours.

NOTE: This assumes your cloud can provision all VMs in parallel. Not all clouds can; if yours cannot, the two phases can take longer.

39) How long does it take to setup the benchmark?

It takes approximately 2-3 hours to prepare the benchmark images, assuming you follow the instructions in the user guide.

If CBTOOL does not support the cloud under test, then an adapter must be developed. The instructions for adding an adapter are in the user guide. The SPEC Cloud subcommittee must review the adapter before reviewing any submission based on it.

40) Can I declare my SPEC submission as white-box without having full administrator access and visibility into the hypervisors and cloud management software?

No.

41) Can I declare my SPEC submission as black-box without it being generally available to other users?

No.

42) Can I declare my SPEC submission as both black-box and private cloud?

Yes, but only if your cloud is generally available to users other than yourself.

43) If I declare my SPEC submission as black-box am I allowed to make hardware customizations or service prioritization?

Only if those customizations or prioritizations are available to other users and not only to yourself.

44) If I make hardware customizations or service prioritization in my cloud can I declare my SPEC submission as white-box?

Yes, but only if you have full administrative access to and visibility into the hardware and can configure the benchmark to provide the appropriate supporting evidence as described in the SPEC documentation. You must also ensure the cloud has no other instances or workloads actively provisioned. If you do not have this level of control (defined as “complete” control in the documentation), then you must redefine your cloud as a private black-box cloud, remove those prioritizations or customizations, and make the cloud generally available to other users.

If your environment does not comply with the benchmark’s requirements for either a black-box or white-box cloud, then test results should not be submitted. If SPEC discovers a submission did not comply with the Run and Reporting rules or an attempt to reproduce your results fails, SPEC has the right to invalidate your submission and mark it as non-compliant.