Document Title: faq.txt
Subject:        FAQ for HPC2002
Last Updated:   21 Nov 2002 2pm wj
-------------------------------------------------------------

Contents
--------
- SPEC General
- Use of SPEC Benchmarks in Research
- SPEC HPC2002 Description
- Publication of SPEC HPC2002 Results
- SPEC HPC2002 Background and Rationales
- SPEC HPC2002 in Comparison with Other Benchmarks
- Obtaining More Information


SPEC General
------------

Q1. What is SPEC?

SPEC is an acronym for the Standard Performance Evaluation Corporation. SPEC is a non-profit organization composed of computer vendors, systems integrators, universities, research organizations, publishers and consultants whose goal is to establish, maintain and endorse a standardized set of relevant benchmarks for computer systems. Although no one set of tests can fully characterize overall system performance, SPEC believes that the user community will benefit from an objective series of tests which can serve as a common reference point.

Q2. What is a benchmark?

Webster's II Dictionary defines a benchmark as "a standard of measurement or evaluation." A computer benchmark is typically a computer program that performs a strictly defined set of operations (a workload) and returns some form of result (a metric) describing how the tested computer performed. Computer benchmark metrics usually measure speed (how fast the workload was completed) or throughput (how many workloads were completed per unit of time). Running the same computer benchmark on multiple computers allows a comparison to be made.

Q3. Why use a benchmark?

Ideally, the best comparison test for systems would be your own application with your own workload. Unfortunately, it is often very difficult to get a wide base of reliable, repeatable and comparable measurements for different systems using your own application with your own workload. This might be due to time, money, confidentiality, or other constraints.

Q4.
What options are viable in this case?

At this point, you can consider using standardized benchmarks as a reference point. Ideally, a standardized benchmark will be portable and may already have been run on the platforms that you are interested in. Before you consider the results, however, you need to be sure that you understand the correlation between your application/computing needs and what the benchmark is measuring. Are the workloads similar, and do they have the same characteristics? Based on your answers to these questions, you can begin to see how the benchmark may approximate your reality.

Note: The SPEC benchmark suites are not intended as a replacement for benchmarking actual customer applications when making vendor or product selections.


Use of SPEC Benchmarks in Research
----------------------------------

Q5: Is the use of SPEC benchmarks encouraged in research?

SPEC encourages the use of its benchmarks for research. Note that much of the benchmarking documentation is written from the perspective of a benchmarker who intends to generate SPEC-publishable results. However, SPEC has defined guidelines for research use of its benchmarks (see http://pro.spec.org/private/hpg/webpages/academic_rules.html). They are intended to be consistent with any guidelines for high-quality scientific work.

Q6: I only want to use the SPEC benchmarks for my research. Do I need to understand the "SPEC philosophy"?

The SPEC benchmarking philosophy deals with fairness in obtaining and reporting computer performance measurements. Many of these ideas apply to scientific performance evaluation as well. Hence, familiarizing yourself with the ideas behind SPEC benchmarks may prove useful for your research and the publication quality of your results.

Q7: Are any of the SPEC rules binding for me?

The only way you could have legally obtained access to the SPEC benchmarks is by becoming a licensee of SPEC.
(Your advisor or a colleague who gave you the codes may be the licensee, and he or she has agreed to abide by SPEC's rules.) However, most rules apply to the process of generating SPEC-publishable benchmark results. For research use, SPEC has defined guidelines (see http://pro.spec.org/private/hpg/webpages/academic_rules.html).

Q8: Are any of the benchmark run tools useful for my research?

The goal of SPEC's benchmark run tools is to help the benchmarker, enforce SPEC's run rules, and assure the quality of the benchmark reports. Some important aspects of the tools are:

o Benchmark codes, and even whole suites, can be made, run, and validated, and reports generated, with a single command line.
o The SPEC-provided makefiles are platform-independent (among the systems whose manufacturers participate in SPEC). System-specific make commands are separated out into a so-called config file.
o When making and running a benchmark, the tools copy the source and all relevant files into a completely separate "run" directory, isolating the run from the original source and from other runs.

All these facilities can be useful for research projects as well. It may be worth learning about the tools.

Q9: I want to use the source code and input files only; I don't want to learn about the run tools. How do I proceed?

SPEC HPC benchmarks are full applications, which cannot be made with a simple "f77 *.f" command and executed with "a.out".


SPEC HPC2002 Description
------------------------

The metrics of SPEC HPC2002 are:

   SPECseis<size>2002
   SPECchem<size>2002
   SPECenv<size>2002

where <size> indicates the data size: S, M, L, X.

All metrics are computed from the overall wallclock execution time T of the benchmark, in seconds, as 86400/T. This can be interpreted as the number of times the benchmark could run consecutively in a day; for example, a run that takes 7200 seconds (two hours) scores 86400/7200 = 12. Note, however, that this is *not* a throughput measure. A higher score means "better performance" on the given workload. The performance for the different data sets cannot be compared; they may exercise different execution paths in the benchmarks.

Q19.
Which SPEC HPC2002 metric should be used to compare performance?

It depends on your needs. SPEC provides the benchmarks and results as tools for you to use. You need to determine how you use a computer or what your performance requirements are, and then choose the appropriate SPEC benchmark.

Q20: How can I obtain SPEC HPC2002?

SPEC HPC2002 is available on CD-ROM for $1,200; discounts are available for universities and research organizations. For more information, send e-mail to info@spec.org.


Publication of SPEC HPC2002 Results
-----------------------------------

Q21. Where are SPEC HPC2002 results available?

Results for all measurements submitted to SPEC are available at http://www.spec.org/hpg

Q22: Can SPEC HPC2002 results be published outside of the SPEC web site?

Yes. SPEC HPC2002 results can be freely published if all the Run and Reporting Rules have been followed and the results have been reviewed by SPEC/HPG for a nominal fee. The SPEC HPC2002 license agreement binds every purchaser of the suite to the run and reporting rules if results are quoted in public. A full disclosure of the details of a performance measurement must be provided to anyone who asks. See the SPEC HPC2002 Run and Reporting Rules for details. SPEC strongly encourages that results be submitted for the web site, since this ensures a peer review process and a uniform presentation of all results.

The Run and Reporting Rules contain an exemption clause for research and academic use of SPEC HPC2002. Results obtained in this context need not comply with all the requirements for other measurements. It is required, however, that research and academic results be clearly distinguished from results submitted officially to SPEC.


SPEC HPC2002 Background and Rationales
--------------------------------------

Q23. Why use SPEC HPC2002?

SPEC HPC2002 provides the most realistic and comprehensive benchmarks for measuring the performance of a computer system as a whole. The benchmarks are large, realistic computational applications.
Among all the SPEC suites, HPC2002 has the most flexible run rules, allowing many code optimizations. This reflects computing practice on high-performance systems and allows the benchmarker to achieve the best application performance.

Other advantages to using SPEC HPC2002:

- Benchmark programs are developed from actual end-user applications, as opposed to being synthetic benchmarks.
- Multiple vendors use the suite and support it.
- SPEC HPC2002 is portable to many platforms.
- A wide range of results is available at http://www.spec.org/hpg
- The benchmarks must be run and reported according to a set of rules to ensure comparability and repeatability.
- HPC2002 allows comparison of the OpenMP and MPI parallelization paradigms.

Q24: What organizations were involved in developing SPEC HPC2002?

SPEC HPC2002 was developed by the Standard Performance Evaluation Corp.'s High-Performance Group (SPEC/HPG), formed in January 1994. Founding partners of SPEC/HPG include SPEC members, former members of the Perfect Benchmarks effort, and other groups working in the benchmarking arena. SPEC/HPG's mission has remained the same: to maintain and endorse a suite of benchmarks that represent real-world, high-performance computing applications. Current members are Compaq, Fujitsu America, IBM, Sun Microsystems and the Tsukuba Advanced Computing Center; current associates are Purdue University, Real World Computing, the University of Illinois, the University of Minnesota and the University of Tennessee.

Q25: What criteria were used to select the benchmarks?

The benchmark applications were collected with the criterion of obtaining the most realistic, largest computational applications that can be distributed by SPEC.

Q26: What are SPEC/HPG's plans for adding applications to SPEC HPC2002?

SPEC/HPG is examining additional applications used in other areas of computational analysis running on high-performance computers.
Applications under consideration include computational fluid dynamics (CFD), molecular dynamics, and climate, ocean and weather codes. The SPEC HPC suite is updated continuously. Contributions are encouraged.

Q27: Will SPEC/HPG replace applications in conjunction with changes in industrial software code?

Yes. Applications in the SPEC HPC2002 suite will be reviewed on a regular basis, and when newer versions are available, they will be incorporated into the benchmark suite. If an application falls out of use within its industrial area, a new, more relevant application will be adopted to replace it.

Q28: Will SPEC HPC2002 include more applications written in C or C++ in the future?

If a suitable application representing relevant computational work in industry is written in C or C++, it will certainly be considered. In fact, both applications in SPEC HPC2002 V1.0 contain components written in C.

Q29: How do the SPEC HPC2002 benchmarks address different parallel architectures, such as clusters, vector systems, SMPs and NUMA?

SPEC HPC2002 benchmarks can be executed in serial or parallel mode. Following the agreed-upon software standards for parallel systems, the parallel implementations are based on the message-passing programming model MPI and on the directive-based OpenMP API. Since high-performance computing systems use different architectures, the SPEC HPC2002 run rules allow for some flexibility in adapting a benchmark application to run in parallel mode. To ensure that results are relevant to end users, SPEC/HPG requires that systems running SPEC HPC2002 benchmarks adhere to the following rules:

o they must provide a suitable environment for running typical C and Fortran programs,
o the system vendor must offer the implementation for general use, and
o the implementation must be generally available, documented, and supported by the vendor.

Q30: Are SPEC HPC2002 results comparable for these different parallel architectures?

Yes.
Most consumers of high-performance systems are interested in running a single important application, or perhaps a small set of critical applications, on these high-priced machines. The amount of time it takes to complete a particular computational analysis is often critical to a high-performance systems user's business. For these consumers, being able to compare different machines' abilities to complete a relevant problem of a specific size for their application is valuable information, regardless of the architectural features of the system itself.

Q31: Are SPEC HPC2002 results comparable across workload sizes? Can you compare serial results to parallel results?

Varying the problem size, but not the system or parallelization, demonstrates how the application performs under a greater workload. The definition of "workload" is application-specific and meaningful to users doing that sort of work. With SPECseis, for example, larger trace files require more I/O, larger FFTs, and longer running times. A seismic analyst will be able to use the benchmark results to understand the ability of a machine to accomplish mission-critical tasks. Different data sets may also exercise different functionality of the codes, which must be considered when interpreting scalability with respect to data size.

Comparing serial to parallel results yields significant information as well: it shows the scalability of the test system for a specific benchmark code.

Q32: How will SPEC/HPG address the evolution of parallel programming models?

As standards emerge for parallel programming models, they will be reflected in the SPEC HPC2002 benchmarks. In response to the growing acceptance of SMP architectures, for example, SPEC/HPG is developing SAS (shared address space) parallel versions of its current benchmarks.

Q33: Can SPEC HPC2002 benchmarks be run on a high-end workstation?

Yes, they can be run on single-processor machines.
The smaller problem sizes are likely to be the most suitable for these systems.

Q34: Traditionally, SPEC has not allowed any code changes in its benchmarks. Why are code changes allowed in SPEC HPC2002, and how did SPEC/HPG decide what should be allowed?

SPEC/HPG recognizes that customers who will spend many thousands to millions of dollars on a high-performance computer are willing to invest additional money to optimize their production codes. Beyond delivering more return on investment, code changes are necessary because there are so many different high-performance architectures; moving an application from one architecture to another is far more involved than porting a single-CPU code from one workstation to another. SPEC/HPG decided that since customers optimize their programs, vendors should be allowed to perform the same level of optimization as a typical customer. There are specific rules that vendors must follow in optimizing codes. These rules were chosen to let each vendor show what its systems are capable of, without allowing large application rewrites that would compromise performance comparisons. Each vendor's code changes must be fully disclosed to the entire SPEC/HPG membership and approved before results are published. These changes must also be included in published reports, so customers know what changes they would have to make to duplicate the results.

Q35: Do SPEC HPC2002 benchmarks measure speed or throughput?

Both. SPEC HPC2002 benchmarks measure the time it takes to run an application on the system being tested -- that is a test of speed. The SPEC HPC2002 metric also normalizes the benchmark's elapsed time to the number of seconds in a day. So the benchmarks also measure throughput, since the metric reports how many benchmark runs could be completed, back to back, in a 24-hour period.

Q36: Does SPEC HPC2002 make SPEC CPU2000 obsolete? What does it measure that SPEC CPU2000 does not?
SPEC HPC2002 results provide information that supplements SPEC CPU2000 results. Consumers of high-performance computing systems usually run a particular application or set of applications. It is important for these consumers to know how applications in their area of analysis will perform on the systems under consideration. This is the kind of specific information that SPEC HPC2002 provides.

Q37: Why doesn't SPEC/HPG define a metric such as MFLOPS or price/performance?

SPEC/HPG chose to focus on total application performance for large, industrially relevant applications. In this benchmarking environment, a simple metric such as MFLOPS is inadequate and misleading. Customers need to understand the expected performance of systems under consideration for purchase. Real-world performance includes all of the set-up, computation and post-processing work. Since the pre- and post-processing phases of applications can be significant factors in total system performance, SPEC/HPG chose to concentrate on total system performance.

Q38. Why does HPC2002 have only a "peak" but no baseline metric?

In contrast to other SPEC benchmark suites, SPEC HPC2002 includes only one metric per code and data size. There are no "base" results that would measure compiler-only performance. The SPEC HPC2002 run rules allow certain hand optimizations for all metrics. Since high-performance computer customers are willing to invest programming time to tune the applications that run on their systems, a baseline result has little meaning to them. Also, the architectures employed in the high-performance computing market are far more diverse than those found in single-CPU workstations. The baseline or "out-of-the-box" performance of any given application has no correlation to the actual performance a customer could expect to achieve on a particular architecture.

Q39: Why is there no reference machine for performance comparisons?
Reference machines give benchmark users a framework for judging metrics that would otherwise just be meaningless sets of numbers. SPEC HPC2002 uses time as its reference, not the speed of a particular machine. The metric itself tells how many successive benchmark runs can be completed in a 24-hour period on the system being tested.

Q40: Why doesn't SPEC HPC2002 provide a composite or aggregate performance metric?

Providing a composite or aggregate performance metric would undermine the purpose of SPEC HPC2002. SPEC HPC2002 is designed to inform users about how industrial-strength applications in their fields of analysis will perform. These users are particularly interested in how well their applications will scale as parallelism increases. This is why SPEC HPC2002 reporting pages provide metrics for systems with different numbers of processors running the same application and problem size.


SPEC HPC2002 in Comparison with Other Benchmarks
------------------------------------------------

Q41. Some of the benchmark names may sound familiar; are these comparable to other programs?

Many of the SPEC benchmarks have been derived from publicly available application programs, and all have been developed to be portable to as many current and future hardware platforms as practical. Hardware dependencies have been minimized to avoid unfairly favoring one hardware platform over another. For this reason, the application programs in this distribution should not be used to assess the probable performance of commercially available, tuned versions of the same applications. The individual benchmarks in this suite may be similar, but NOT identical, to benchmarks or programs with the same names which are available from sources other than SPEC; therefore, it is not valid to compare SPEC HPC2002 benchmark results with anything other than other SPEC HPC2002 benchmark results.
(Note: this also means that it is not valid to compare SPEC HPC2002 results to results from older SPEC benchmarks; those benchmarks have been changed and should be considered different and not comparable.)

Q42: What is the difference between SPEC HPC2002 and other benchmarks for high-performance systems?

The most important distinction is that SPEC HPC2002 includes applications used in industry and research to do real work. These applications normally run on multiprocessing systems and require the larger computing resources offered by high-end systems. Only minimal modifications were made to the applications used in the SPEC HPC2002 suite. By leaving even "uninteresting" functions of the application code intact, SPEC HPC2002 provides a realistic measure of real-world application performance. SPEC/HPG's methodology differs from previous benchmarking efforts, which concentrated only on the more numerically intensive algorithms.

A second distinction is that SPEC HPC2002 targets all high-performance computer architectures. The applications in the suite are currently in use on a wide variety of systems, including workstations, clusters, SMPs, vector systems and MPPs. The programming models used in the SPEC HPC2002 application codes -- message-passing, shared-memory parallel, and serial -- can be run on all of today's high-performance systems. Other benchmarks tend to be biased towards distributed-memory, shared-memory or vector architectures.

Finally, SPEC HPC2002 provides more than just peak performance numbers. To ensure that SPEC HPC2002 reflects performance for real applications, only a limited number of optimizations are allowed. This contrasts with benchmarks that allow a large number of optimizations, requiring unrealistic development efforts to reproduce benchmark results. It also contrasts with benchmarks that restrict optimizations altogether.

Q43: How does SPEC HPC2002 compare to the NAS Parallel Benchmarks or to Parkbench?
The NPB (NAS Parallel Benchmarks) and Parkbench are kernels or subsets of applications; they are used to compare architectural implementations of machines. SPEC HPC2002 benchmarks are complete, real-world applications used by numerous organizations to solve real problems. These benchmarks allow users to determine how well a given system performs across the entire spectrum of factors involved in solving real-world problems, including numerical computation, I/O, memory access, software systems, and many others.


Obtaining More Information
--------------------------

Q44. How do I contact SPEC for more information or for technical support?

SPEC can be contacted in several ways. For general information, including other means of contacting SPEC, please see SPEC's World Wide Web site at:

   http://www.spec.org/

General questions can be emailed to: info@spec.org

HPC2002 technical support questions can be sent to: HPC2002support@spec.org

Q45. Now that I've read this document, what should I do next?

If you arrived here by starting at the benchmark distribution's readme1st document, you should now verify that your system meets the requirements described in system_requirements.txt. Then you can install the suite, following the instructions in install_guide_unix.txt (which should also work for Linux) or install_guide_nt.txt.