The lack of a standardized measure for graphics performance has frustrated
both users and vendors of graphics hardware. Potential buyers have trouble
sorting out the performance claims of hardware vendors. On the other hand,
vendors are often faced with developing costly individual benchmarks to
measure the performance of their machines for users' specific applications.
Until recent years, the methods available for measuring performance were ambiguous. The "vectors-per-second" method, for example, raises questions about the length of the vectors and whether they are drawn end-to-end or separately. The "polygons-per-second" measurement suffers from similar ambiguities: there is no standard-size polygon, nor any standard for how polygons are shaded or lit.
The first organized effort to address this issue occurred in late 1986 when major workstation vendors and users met to discuss the problem. This and subsequent meetings led to the Picture-Level Benchmark (PLB) project, which includes Digital Equipment Corp., Hewlett-Packard, IBM, and Sun Microsystems.
The Picture-Level Benchmark (PLB) is a software package that provides an "apples-to-apples" comparison of graphics display performance across different hardware platforms.
Picture-Level Benchmark Reports
About the PLB
The PLB is designed to measure the performance of CRT-based display systems. This includes, but is not limited to, engineering workstations, personal computers, and special-purpose attached display systems.
The only requirements for the PLB to work are that geometry be presented to the system in a specified format and that the PLB program be ported to the device under test. The PLB includes these major components:
1. The Benchmark Interchange Format (BIF), the file format for
specifying the geometry and actions that will be performed in
a test.
2. The Benchmark Timing Methodology (BTM), which provides a
standardized performance measurement.
3. The Benchmark Reporting Format (BRF), for standardized reporting
of test results.
4. The Picture-Level Benchmark (PLB) program, which implements
BIF file processing and runs the test.
5. A suite of files for testing a PLB implementation.
6. A suite of BIF standard benchmark files that are used for graphics
performance tests.
The PLB program itself is platform-dependent. In order to run BIF files, someone (most likely the vendor) has to adapt the PLB code for a specific hardware configuration. The BIF Specification, the standard files, test files, and a PLB sample implementation are available via anonymous ftp.
The most exacting method of performance measurement is for users to convert their applications into BIF files and run them directly on vendors' ports of the PLB program. The PLB subcommittee, however, realizes that many users may have neither the time nor the technical expertise to do this.
To make PLB testing results available to a wider range of graphics users, the PLB project group develops BIF files based on popular applications. Eight of these files are used to derive the performance numbers reported in this issue.
It is important to note that although the PLB allows buyers to compare performance, it does not address the issue of display quality. This subjective issue is left to the eyes of the beholder.
How Reported Results Were Obtained
The PLB project group has approved eight standard benchmark files for graphics performance testing. These files are separated into three categories: 3-D wireframe, 3-D surface, and "other." Click here to see detailed descriptions of the files. A single figure of merit is reported for each of the first two categories. These numbers, PLBwire93 and PLBsurf93, represent the geometric means of the PLBlit and PLBopt numbers for the standard benchmark files in those categories.
Performance numbers were reviewed through the verification procedures established by the GPC project group.
Performance results for each standard benchmark file are reported using two numbers separated by a colon. The number on the left is designated as "PLBlit." It is a literal number aimed at graphics users who want to measure performance for the same graphics entities from one platform to another and who are unable or unwilling to tune their applications to a particular hardware system or graphics interface. The PLBlit number reflects an application file that is run without optimizations. This number might be better suited to some hardware architectures than others; that is why the PLB project group decided on using two numbers to measure performance.
The number on the right is designated as "PLBopt." It is an optimized number for those users who are willing to tune their applications to achieve better performance on specific hardware platforms with specialized graphics interfaces. The PLBopt number shows the best possible performance for a vendor's specialized hardware configuration and application programmer's interface (API).
A typical performance number for a standard benchmark file, such as "sys_chassis," looks like this:
31.1 : 35.2
The left number represents the literal (PLBlit) number, the right the optimized (PLBopt) number. "NP" on the left side of the colon indicates that it is not possible for the platform hardware or software to execute the benchmark file's graphics entities under the PLBlit guidelines. In this case, the vendor has chosen not to emulate the functionality in its PLB port. "NA" on the right side of the colon means an optimized number is not available at this time.
If, for example, the optimized number is not reported, the entry in the report would look like this:
31.1 : NA
If no number is reported for a particular benchmark file, this is represented by "NR."
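As an illustration only, the following C fragment, which is hypothetical and not part of the PLB software, splits a reported entry into its PLBlit and PLBopt tokens and shows how the NP, NA, and NR cases appear:

/* Hypothetical sketch: interpreting one reported "PLBlit : PLBopt" entry.
   A token is either a score, "NP" (left side), or "NA" (right side);
   a bare "NR" means no result was reported for the file at all. */

#include <stdio.h>
#include <string.h>

static void show_entry(const char *entry)
{
    char lit[16], opt[16];

    if (strcmp(entry, "NR") == 0) {            /* nothing reported */
        printf("%-12s -> not reported\n", entry);
        return;
    }
    if (sscanf(entry, "%15s : %15s", lit, opt) == 2)
        printf("%-12s -> PLBlit=%s  PLBopt=%s\n", entry, lit, opt);
}

int main(void)
{
    show_entry("31.1 : 35.2");  /* literal and optimized numbers  */
    show_entry("31.1 : NA");    /* optimized number not available */
    show_entry("NP : 35.2");    /* literal run not possible       */
    show_entry("NR");           /* file not reported              */
    return 0;
}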
When optimizations are made, they are listed at the bottom of the PLB Report page. A PLB Report is published for each hardware configuration for which the vendor submits performance numbers. Click here to see a sample PLB Report page.
Performance numbers are derived from a normalized constant that is divided by the elapsed time in seconds required to perform the test. A high number represents better performance (i.e., fewer seconds to display the picture under test).
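As a hedged illustration (the actual normalizing constant is defined per benchmark file and is not reproduced here), the calculation has the form

performance number = normalized constant / elapsed time in seconds

so a file with a hypothetical constant of 1000 that takes 32.2 seconds to display would score about 31.1, while the same file displayed in 28.4 seconds would score about 35.2.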
Some PLB Report pages might include a note regarding the conversion time ratio in the "Comments/Notes" section. A ratio of "2" or more indicates that it takes a significant amount of time to generate the optimizations used to obtain the PLBopt number the vendor reported for a particular standard benchmark file.
Although PLB project group members agree that no single figure of merit can adequately describe the performance characteristics of graphics systems across all applications, the PLB project group currently provides composite numbers in two categories. The project group believes that graphics characteristics within these categories are similar enough to make a single figure of merit credible.
The single figure of merit number is computed by taking the geometric mean for both the PLBlit and PLBopt numbers for standard benchmark files in a given category. If an "NR" is reported for a given benchmark file within a category, the submitter must report an "NR" for the single figure of merit; numbers can still be reported, however, for individual benchmark files within that category. Where an "NP" is listed for a PLBlit number, it is replaced with a value that is one-third that of the optimized number in order to calculate the single figure of merit for the category. Where "NA" (not available) is listed for a PLBopt number, the PLBopt value is considered the same as the PLBlit value for the particular standard benchmark file.
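The following C fragment is a sketch of that arithmetic under assumed per-file scores; it is not the official PLB tooling, the sentinel values and names are invented, and the NR case, which suppresses the composite entirely, is omitted:

/* Hypothetical sketch: composite PLBlit/PLBopt figures for one category.
   NP and NA are marked with negative sentinels; all scores are invented. */

#include <math.h>
#include <stdio.h>

#define NP (-1.0)   /* PLBlit "not possible"  */
#define NA (-2.0)   /* PLBopt "not available" */

/* Geometric mean of n positive values. */
static double geometric_mean(const double *v, int n)
{
    double log_sum = 0.0;
    for (int i = 0; i < n; i++)
        log_sum += log(v[i]);
    return exp(log_sum / n);
}

int main(void)
{
    double lit[3] = { 31.1,  NP,  24.8 };   /* PLBlit per file */
    double opt[3] = { 35.2, 42.0,  NA  };   /* PLBopt per file */
    double lit_adj[3], opt_adj[3];

    for (int i = 0; i < 3; i++) {
        /* NA on the PLBopt side: use the PLBlit value instead.          */
        opt_adj[i] = (opt[i] == NA) ? lit[i] : opt[i];
        /* NP on the PLBlit side: use one-third of the optimized value. */
        lit_adj[i] = (lit[i] == NP) ? opt_adj[i] / 3.0 : lit[i];
    }

    printf("composite PLBlit : %.1f\n", geometric_mean(lit_adj, 3));
    printf("composite PLBopt : %.1f\n", geometric_mean(opt_adj, 3));
    return 0;
}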
The reporting format used in this report is a summary of the detailed report generated by the Benchmark Reporting Format (BRF) component of the PLB. This standardized report lists data about the hardware and software configuration of the system tested and summarizes key characteristics of the test images. Included in the summary are data about the average number of primitives displayed and the number of times they are called per frame, data pertaining to graphics attributes and matrix operations, and a global exceptions table. Test loop timing information lists the number of frames displayed, elapsed time, and average time per frame.
The BRF provides enough details about the test to enable users to accurately compare systems. Should a vendor implement a test substantially different from a competitor's, the detailed data is likely to show how the discrepancy occurred.