SPECpower_ssj2008 Run and Reporting Rules

Last updated: May 05 2008


(To check for possible updates to this document, please see http://www.spec.org/power_ssj2008/docs/SPECpower_ssj2008-Run_Reporting_Rules.html)

Overview

Selecting one of the following will take you to the detailed table of contents for that section:

1. Introduction

2. Run Rules

3. Reporting Rules

4. Submission Requirements for SPECpower_ssj2008

5. SPECpower_ssj2008 Benchmark Kit Overview


Detailed Contents

1. Introduction

1.1 Philosophy

1.1.1 Applicability

1.1.2 Optimizations

1.2 Caveats

1.3 Research and Academic Usage

2. Run Rules

2.1 Measurement

2.2 Initializing and Running Benchmark

2.3 Workload

2.3.1 Manual Intervention

2.3.2 Sequence of Target Loads

2.3.2.1 The Active Idle Interval

2.4 SUT Configuration Parameters

2.5 Benchmark Control Parameters

2.5.1 Warehouse Count and Override

2.5.2 Validity Checks

2.6 Optimization Flags

2.7 Testbed Configuration

2.8 Line Voltage Source

2.9 Environmental Conditions

2.10 General Availability

2.10.1 SUT Availability for Historical Systems

2.11 System Under Test (SUT)

2.11.1 Electrical Equivalence

2.11.2 Hardware

2.11.2.1 Network Interfaces

2.11.3 Software

2.12 Java Specifications

2.12.1 Feedback Optimization and Precompilation

2.12.2 Benchmark Binaries and Recompilation

2.13 Power and Temperature Measurement

2.13.1 Power Analyzer Setup

2.13.2 Power Analyzer Specifications

2.13.3 Temperature Sensor Specifications

2.13.4 Supported and Compliant Devices

2.13.5 Acceptance Process for New Measurement Devices

3. Reporting Rules

3.1 Reporting Metric and Result

3.1.1 Publication

3.1.2 Estimates

3.1.3 Comparison to Other Benchmark Suites

3.1.4 Addendum to OSG Fair Use Policy

3.2 Reproducibility

3.3 Testbed Configuration Disclosure

3.3.1 General Availability Dates

3.3.2 Test Sponsor

3.3.3 Benchmark Results Summary

3.3.4 SUT

3.3.4.1 System Class - Component Source

3.3.4.2 SUT Hardware

3.3.4.3 SUT Software

3.3.4.4 System Under Test Notes

3.3.5 Controller System

3.3.5.1 Power Analyzer and Temperature Sensor

3.3.6 Disclosure Notes

3.3.7 Electrical and Environmental Data

4. Submission Requirements for SPECpower_ssj2008

5. SPECpower_ssj2008 Benchmark Kit Overview

5.1 Documents overview


1. Introduction

SPECpower_ssj2008 is the first generation SPEC benchmark for evaluating the power and performance of server class computers. This document specifies the guidelines on how SPECpower_ssj2008 V1.00 is to be run for measuring and publicly reporting power and performance results of servers. These rules abide by the norms laid down by SPEC in order to ensure that results generated with this benchmark are meaningful, comparable to other generated results, and repeatable, with documentation covering factors pertinent to reproducing the results. Per the SPEC license agreement, all results publicly disclosed must adhere to these Run and Reporting Rules.

1.1 Philosophy

SPEC believes the user community will benefit from an objective series of benchmark results, which can serve as a common reference and be considered as part of an evaluation process. SPEC expects that any public use of results from this benchmark suite shall be for Systems Under Test (SUTs) and configurations that are appropriate for public consumption and comparison. For results to be publishable, SPEC expects:

  • Proper use of the SPEC benchmark tools as provided.
  • Availability of an appropriate full disclosure report (FDR).
  • Availability of the Hardware and Software used (see section 3.3.1).
  • Support for all of the appropriate protocols.

1.1.1 Applicability

SPEC intends that this benchmark measures the power and performance of systems providing environments for running server-side Java applications. It is not a J2EE benchmark and therefore it does not measure Enterprise Java Beans (EJBs), servlets, Java Server Pages (JSPs), etc. Power consumption measured by this benchmark should not be assumed to represent the power consumption of other applications on the same hardware.

1.1.2 Optimizations

SPEC is aware of the importance of optimizations in producing the best system power and performance. SPEC is also aware that it is sometimes difficult to draw an exact line between legitimate optimizations that happen to benefit SPEC benchmarks and optimizations that specifically target a SPEC benchmark. However, with the rules below, SPEC wants to increase the awareness of implementers and end users of issues of unwanted benchmark-specific optimizations that would be incompatible with SPEC's goal of fair benchmarking.

  • Hardware and software used to run the SPECpower_ssj2008 benchmark must provide a suitable environment for running typical server-side Java programs. (Note, this may be different from a typical environment for client Java programs.)
  • Software optimizations must generate correct code for a class of programs, where the class of programs must be larger than a single SPEC benchmark.
  • Hardware and/or software optimizations must improve power and/or performance for a class of programs, where the class of programs must be larger than a single SPEC benchmark.
  • The vendor encourages the implementation for general use.
  • The implementation is generally available, documented, and supported by the providing vendor(s).
Furthermore, SPEC expects that any public use of results from this benchmark shall be for configurations that are appropriate for public consumption and comparison. In the case where it appears that the above guidelines have not been followed, SPEC may investigate such a claim and take action in accordance with current policies.

1.2 Caveats

SPEC reserves the right to investigate any case where it appears that these guidelines and the associated benchmark run and reporting rules have not been followed for a published SPEC benchmark result. SPEC may request that the result be withdrawn from the public forum in which it appears and that the benchmarker correct any deficiency in product or process before submitting or publishing future results.

SPEC reserves the right to adapt the benchmark codes, workloads, and rules of SPECpower_ssj2008 as deemed necessary to preserve the goal of fair benchmarking. SPEC will notify members and licensees whenever it makes changes to the benchmark and may rename the metrics. In the event that the workload or metric is changed, SPEC reserves the right to republish in summary form "adapted" results for previously published systems, converted to the new metric. In the case of other changes, a republication may necessitate retesting and may require support from the original test sponsor.

Relevant standards are cited in these run rules as URL references, and are current as of the date of publication. Changes or updates to these referenced documents or URLs may necessitate repairs to the links and/or amendment of the run rules. The most current run rules will be available at the SPEC web site at http://www.spec.org. SPEC will notify members and licensees whenever it makes changes to the suite.

1.3 Research and Academic Usage

SPEC encourages use of the SPECpower_ssj2008 benchmark in academic and research environments. SPEC encourages researchers to obey as many of the run rules as practical. However, it is understood that experiments in such environments may be conducted in a less formal fashion than that demanded of licensees disclosing results which are intended to comply with the run rules. For example, a research environment may use early prototype hardware that simply cannot be expected to remain operational for the length of time required to run the required number of points, or may use research software versions that are unsupported and are not generally available. Adhering to the SPEC run rules improves the clarity, reproducibility, and comparability of research results. When the rules cannot be followed, the run is non-compliant: it cannot be represented as compliant, and the benchmark metric cannot be used. Such results must not be compared to compliant results. SPEC requires that research results be clearly distinguished from compliant results and that any and all deviations from the rules be disclosed.

2. Run Rules

2.1 Measurement

The provided SPECpower_ssj2008 tools must be used to run and produce measured SPECpower_ssj2008 results. The SPECpower_ssj2008 metric is a function of the SPECpower_ssj2008 workload (see section 2.3), and the defined benchmark control parameters (see section 2.5). SPECpower_ssj2008 results are not comparable to any other application area power and performance metric.

2.2 Initializing and Running Benchmark

For guidance, please consult the User Guide and Hardware Setup Guide (http://www.spec.org/power_ssj2008/).

2.3 Workload

SPECpower_ssj2008 exercises a Java application workload. A detailed description can be found in the design document (http://www.spec.org/power_ssj2008/).

2.3.1 Manual Intervention

No manual intervention or optimization to the SUT or its internal and external environment is allowed during the benchmark run.

2.3.2 Sequence of Target Loads

The benchmark runs at multiple target loads to determine the power consumption of the SUT under varying processing loads. First, the maximum throughput achievable by the SUT is determined by running the workload unconstrained for at least 3 calibration intervals. The maximum is set as the arithmetic average of the throughputs achieved during the final two calibration interval runs. The workload is then run in a constrained manner, with delays inserted into the workload stream, to obtain total throughputs of 100%, 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, and 10% of the maximum throughput. The delays inserted into workload streams are exponentially random with a fixed maximum of 10 seconds. During each of these target loads, the power characteristics of the SUT as well as the temperature are recorded. Finally, the power characteristics and temperature are measured and recorded during an idle interval during which the SUT processes no Java transactions. The preceding sequence is automatically implemented by the benchmark harness and must not be changed for a compliant run.
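The load sequence described above can be sketched as follows. This is a minimal illustration, not the harness implementation: the function names are hypothetical, and truncating the exponential delay at 10 seconds is an assumption, since the text states only that delays are exponentially random with a fixed maximum.

```python
import random

def max_throughput(calibration_throughputs):
    """Arithmetic average of the final two calibration intervals."""
    if len(calibration_throughputs) < 3:
        raise ValueError("at least 3 calibration intervals are required")
    return sum(calibration_throughputs[-2:]) / 2.0

def target_loads(maximum):
    """Target throughputs for 100%, 90%, ..., 10% of the calibrated maximum."""
    return [maximum * pct / 100.0 for pct in range(100, 0, -10)]

def capped_exponential_delay(mean_seconds, rng, cap_seconds=10.0):
    """Assumed capping scheme: draw an exponential delay, truncate at the cap."""
    return min(rng.expovariate(1.0 / mean_seconds), cap_seconds)

# Hypothetical throughputs from three calibration intervals.
calibrated = max_throughput([98000.0, 101000.0, 99000.0])
loads = target_loads(calibrated)
```

In a compliant run none of this is done by hand; the harness performs the calibration and load sequencing automatically.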

2.3.2.1 The Active Idle Interval

During active idle, the SUT must be in a state in which it is capable of completing workload transactions. The active idle measurement interval is treated in a manner consistent with all other target load levels, with the exception that no transactions occur during the active idle interval.

The intent in defining and automating active idle power measurement within the SPECpower_ssj2008 benchmark is to prevent manipulation of idle power measurements. The benchmark workload and the JVM process in which it is running are to remain active without interruption for the duration of this phase.

2.4 SUT Configuration Parameters

The "SPECpower_ssj_config.props" file contains configuration information used to generate the final report, and values must be populated appropriately by the tester to reflect the SUT for a compliant run.

2.5 Benchmark Control Parameters

A number of parameters control the operation of SPECpower_ssj2008; they are set in the "SPECpower_ssj.props" file. The properties in the "Changeable Input Parameters" section of that file may be set to values other than the default for a compliant run. These are the parameters marked "Yes" in the "Adjustable for a Compliant Run" column of the table below. For a compliant run, the properties in the "Fixed Input Parameters" section of the properties file being used must not be changed from the values as provided by SPEC. All workload JVM instances must use the same parameters in a multi-JVM environment.

Parameter name                       Compliant Value                     Adjustable for a Compliant Run
calibration.interval_count           10 >= integer >= 3                  Yes
calibration.length_seconds           240                                 No
ccs.enabled                          true                                No
director.connect_timeout             any                                 Yes
director.enabled                     true                                No
director.hostname                    any                                 Yes
deterministic_random_seed            false                               No
idle.length_seconds                  240                                 No
idle.post_calibration                false                               No
idle.post_run                        true                                No
idle.pre_calibration                 false                               No
idle.settle_seconds                  0                                   No
include_file                         any                                 Yes
jvm_instances                        any                                 Yes
jvm_instances_all_hosts              jvm_instances                       No
load_level.count                     10                                  No
load_level.delay_between             10                                  No
load_level.length_seconds            240                                 No
load_level.number_warehouses         availableProcessors (see 2.5.1)     Yes
load_level.percentage_sequence       none                                No
load_level.post_measurement_seconds  30                                  No
load_level.pre_measurement_seconds   30                                  No
load_level.target_max_throughput     -1                                  No
load_level.throughput_sequence       none                                No
log_level                            INFO                                No
orderlines_per_order                 10                                  No
output_directory                     any                                 Yes
override_itemtable_size              20000                               No
power_meter.enabled                  false                               No
power_meter.hostname                 any                                 Yes
power_meter.port                     any                                 Yes
scheduler.batch_size                 1000                                No
scheduler.log_arrival_rates          false                               No
scheduler.max_arrival_delay          10                                  No
scheduler.number_threads             availableProcessors / jvmInstances  No
scheduler.single_queue               false                               No
screen_write                         false                               No
show_warehouse_detail                false                               No
status.port                          any                                 Yes
steady_state                         true                                No
suite                                SPECpower_ssj                       No
transaction_mix.cust_report          10                                  No
transaction_mix.delivery             1                                   No
transaction_mix.new_order            10                                  No
transaction_mix.order_status         1                                   No
transaction_mix.payment              10                                  No
transaction_mix.stock_level          1                                   No
transaction.response_time            None                                No
warehouse_population                 60                                  No

Table 2.5-1 Compliant Values for Benchmark Parameters
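As a rough illustration of how a tester might pre-check a properties file against a few of the fixed values in Table 2.5-1, consider the sketch below. It is not part of the benchmark kit: the function names and the sample file contents are hypothetical, only a subset of the table is covered, and keys are written as they appear in the table (the shipped file may prefix them, as the "input.load_level.number_warehouses" property in section 2.5.1 suggests).

```python
# Subset of the fixed values from Table 2.5-1 (keys named as in the table).
FIXED_VALUES = {
    "calibration.length_seconds": "240",
    "idle.length_seconds": "240",
    "load_level.count": "10",
    "load_level.length_seconds": "240",
    "load_level.pre_measurement_seconds": "30",
    "load_level.post_measurement_seconds": "30",
    "scheduler.max_arrival_delay": "10",
    "warehouse_population": "60",
}

def parse_props(text):
    """Parse simple 'key=value' lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def find_violations(props):
    """Return messages for settings that differ from the table's values."""
    violations = []
    for key, expected in FIXED_VALUES.items():
        if key in props and props[key] != expected:
            violations.append(f"{key}: expected {expected}, found {props[key]}")
    if "calibration.interval_count" in props:
        count = int(props["calibration.interval_count"])
        if not 3 <= count <= 10:
            violations.append(f"calibration.interval_count: {count} is outside 3..10")
    return violations

# Hypothetical excerpt of a properties file with two non-compliant settings.
sample = """
# illustrative only
calibration.interval_count=2
load_level.length_seconds=120
jvm_instances=4
"""
violations = find_violations(parse_props(sample))
```

The benchmark itself performs the authoritative check at the start of each run (see section 2.5.2); a script like this would only catch mistakes earlier.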

2.5.1 Warehouse Count and Override

The SPECpower_ssj2008 benchmark runs a fixed number of warehouses, N, equal to the number of logical processors in the system under test. This number is, by default, the value returned by the java.lang.Runtime.getRuntime().availableProcessors() API. The value may be overridden by setting the input.load_level.number_warehouses property provided that the result is submitted to SPEC for review and an acceptable reason is given in the config.sw.notes section of the disclosure report. An example of an acceptable reason to override the default value would be if Runtime.availableProcessors() does not return an accurate or valid value for the hardware architecture of the SUT. An example of an unacceptable reason would be to decrease the value of N from the default to hide scalability problems and artificially obtain a higher score.

2.5.2 Validity Checks

At the beginning of each run, the benchmark parameters are checked for conformance to the run rules. Warnings are displayed for non-compliant properties and printed in the final report; however, the benchmark will run to completion producing a report that is not valid for publication.

The following are required for a valid run and are automatically checked:
  • Input Properties:
    • Verify that all input properties meet the criteria defined in section 2.5.
  • Contains all required intervals as defined in section 2.3.2
  • The measurement interval specified in the properties file must be 240 seconds. The actual measurement interval for each load point must be no less than 238.8 seconds (-0.5%) and no greater than 242.4 seconds (+1.0%). This rule allows for some variation in communicating the end of measurement to the threads.
  • In order to ensure that the measurement interval for each JVM occurs during the time all instances are running the following requirement must be met:
    • No JVM can start the measurement interval before all JVMs have entered the pre-measurement interval.
    • No JVM can end the measurement interval before any JVM has completed the post-measurement interval.
    • Ensure that all measurement intervals fully overlap across all JVMs.
  • Interval Length
    • Verify that the elapsed time for each measurement interval is at least 99.5% but no more than 101% of the configured interval length.
  • Temperature
    • Verify that the minimum temperature reading is >= 20°C (from the beginning of the run until the end of Active Idle -- including calibration).
  • Overall Target Load Throughput
    • Ensure that the combined throughput at each load level is within the following limits of its target:
      • +2%, -2.5% for the 100% and 90% target loads
      • ± 2% for the 80% through 10% target loads
  • Recompilation
    • Make sure that ssj.jar is first in the classpath, and verify that the code has not been recompiled.
  • Power Analyzer
    • Ensure that a power analyzer was used, and that PTDaemon marked it as a compliant device.
  • Power Error Readings
    • Validate the percentage of error readings from the power analyzer.
    • Threshold is 1% for Power and 2% for Volt, Ampere and Power Factor, measured only during measurement interval.
  • Target Load Deviations
    • The sum of the throughputs at all load levels must be within 1% of the sum of all targets.
  • Temperature Error Readings
    • Validate the percentage of error readings from the temperature sensor. Threshold is 2%, measured from beginning of the run until the end of Active Idle, including calibration.
  • Temperature Sensor
    • Ensure that a temperature sensor was used, and that PTDaemon marked it as a compliant device.
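
Several of the numeric checks listed above can be expressed compactly. The following is a sketch under stated assumptions, not the harness's validation code: the function names are hypothetical, and treating the throughput limits as inclusive bounds is an assumption (the text gives only the percentages).

```python
def interval_length_ok(elapsed_seconds, configured_seconds=240.0):
    """Elapsed time must be 99.5% to 101% of the configured interval."""
    low = 0.995 * configured_seconds
    high = 1.01 * configured_seconds
    return low <= elapsed_seconds <= high

def throughput_ok(target_percent, actual, target):
    """Per-level limit: +2%/-2.5% at 100% and 90%, +/-2% at 80% through 10%."""
    deviation = (actual - target) / target
    lower = -0.025 if target_percent >= 90 else -0.02
    return lower <= deviation <= 0.02

def total_deviation_ok(actuals, targets):
    """Sum of throughputs over all levels must be within 1% of the sum of targets."""
    return abs(sum(actuals) - sum(targets)) <= 0.01 * sum(targets)

def min_temperature_ok(readings_celsius):
    """Minimum temperature reading must be at least 20 degrees C."""
    return min(readings_celsius) >= 20.0
```

For example, an interval that ran for 238.9 seconds passes the length check, while a throughput of 77,900 against an 80,000 target (about -2.6%) fails the per-level check.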

2.6 Optimization Flags

Both JVMs and native compilers are capable of modifying their behavior based on flags. Flags which do not break conformance to section 2.12 are allowed. All command-line flags used must be reported. All flags used must be documented and supported within the time frame specified in this document for general availability. At the time a result is submitted to SPEC, descriptions of all flags used but not currently publicly documented must be available to SPEC for the review process. When the result is published, all flags used must be publicly documented, either in the vendor's public documentation, in the disclosure, or in a separate flags file.

2.7 Testbed Configuration

These requirements apply to all hardware and software components used in producing the benchmark result, including the System under Test (SUT), network, and controller.

  • Any deviations from the standard default configuration for testbed configuration components must be documented so an independent party would be able to reproduce the configuration and the result without any further assistance.
  • The controller system must be run on a physically different system than the SUT.
  • There is no restriction on which machine the JVM director runs on.

2.8 Line Voltage Source

The preferred Line Voltage source used for measurements is the main AC power as provided by local utility companies. Power generated from other sources often has unwanted harmonics which are incapable of being measured correctly by many power analyzers, and thus would generate inaccurate results.

  • The Line Voltage Source needs to meet the following characteristics:
    • Frequency: (60Hz, 50Hz) ± 1%
    • Voltage: (120V, 110V, 100V, 208V, 220V, 230V) ± 5%
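
A tolerance check against the nominal values listed above might look like the sketch below. The function names are hypothetical, and the check assumes that a measured value is acceptable if it falls within tolerance of any one of the listed nominal values.

```python
NOMINAL_FREQUENCIES_HZ = (60.0, 50.0)
NOMINAL_VOLTAGES = (120.0, 110.0, 100.0, 208.0, 220.0, 230.0)

def within(measured, nominal, tolerance):
    """True if measured lies within +/- tolerance (a fraction) of nominal."""
    return abs(measured - nominal) <= tolerance * nominal

def line_source_ok(frequency_hz, voltage):
    """Frequency within 1% and voltage within 5% of some listed nominal value."""
    freq_ok = any(within(frequency_hz, f, 0.01) for f in NOMINAL_FREQUENCIES_HZ)
    volt_ok = any(within(voltage, v, 0.05) for v in NOMINAL_VOLTAGES)
    return freq_ok and volt_ok
```

Under this reading, a 59.8 Hz, 117 V source is acceptable, while a 55 Hz source is not, whatever its voltage.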

The usage of an uninterruptible power source (UPS) as the line voltage source is allowed, but the voltage output must be a pure sine-wave. For placement of the UPS, see 2.13.1. This usage must be specified in the Note-section of the FDR.

If an unlisted line voltage source is used, a reference to the standard is necessary.

For situations in which the appropriate voltages are not provided by local utility companies (e.g. measuring a server in the United States which is configured for European markets, or measuring a server in a location where the local utility line voltage does not meet the required characteristics), an AC power source may be used, and the power source must be specified in the notes section of the disclosure report. In such situations the following requirements must be met, and the relevant measurements or power source specifications disclosed in the notes section of the disclosure report:

  • Total Harmonic Distortion of source voltage (loaded), based on IEC standards: < 5%
  • The AC Power Source needs to meet the frequency and voltage characteristics previously listed in this section.
  • The AC Power Source must not manipulate its output in a way that would alter the power measurements compared to a measurement made using a compliant line voltage source without the power source.

The intent is that the AC power source does not interfere with measurements such as power factor by trying to adjust its output power to improve the power factor of the load.

2.9 Environmental Conditions

SPEC requires that power measurements be taken in an environment representative of the majority of usage environments. The intent is to discourage extreme environments that may artificially impact power consumption or performance of the server.

SPECpower_ssj2008 requires the following environmental conditions to be met:
  • Ambient temperature range: 20°C or above
  • Elevation: within documented operating specification of SUT
  • Humidity: within documented operating specification of SUT

2.10 General Availability

The entire testbed must consist of components that are generally available on or before the date of publication, or that shall be generally available within three months of the first publication of these results.

Products are considered generally available if they are orderable by ordinary customers and ship within a reasonable time frame. This time frame is a function of the product size and classification and common practice. Some limited quantity of the product must have shipped on or before the close of the stated availability window. Shipped products do not have to match the tested configuration in terms of CPU count, memory size, and disk count or size, but the tested configuration must be available to ordinary customers. The availability of support and documentation of the products must be coincident with the release of the products.

Hardware products that are still supported by their original or primary vendor may be used if their original general availability date was within the last five years. The five-year limit is waived for hardware used in client systems.

Software products that are still supported by their original or primary vendor may be used if their original general availability date was within the last three years.
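
The age limits for supported hardware and software can be illustrated with a small date check. This is a sketch only: the function names are hypothetical, and measuring "within the last N years" back from the publication date is an assumption, since the text does not name the reference date. The waiver for hardware in client systems is not modeled.

```python
from datetime import date

def years_earlier(day, years):
    """Same calendar day a number of years earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)

def within_age_limit(ga_date, reference_date, max_years):
    """True if the general availability date is no older than max_years."""
    return ga_date >= years_earlier(reference_date, max_years)

# Hypothetical dates: 5-year limit for hardware, 3-year limit for software.
publication = date(2008, 5, 5)
hardware_ok = within_age_limit(date(2004, 1, 15), publication, 5)
software_ok = within_age_limit(date(2005, 3, 1), publication, 3)
```

Here the hardware date is inside its five-year window, while the software date falls just outside the three-year window.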

Information must be provided in the disclosure to identify any component that is no longer orderable by ordinary customers.

See http://www.spec.org/osg/policy.html - Appendix C. Guidelines for General Availability

2.10.1 SUT Availability for Historical Systems

In the interest of providing some historical perspective on the power consumption and performance for older or obsolete systems, SPEC will consider review and publication of otherwise compliant benchmark results. This is expressly for systems where the general availability date is/was beyond the current general availability limits and may no longer be supported by their original vendor.

Submissions will be reviewed and accepted with the stipulation that the hardware availability date reflect the first availability of the platform (as opposed to the date of purchase). If the SUT is "parts built", the general availability date shall reflect that of the "core components", i.e. the chipset and processor combination.

For these historic results, this entry must be included in the SUT Notes section:
"The general availability of this SUT is more than 5 years in the past, is generally considered obsolete and may or may not be supported by the original vendor or other third party. This benchmark result is intended to provide past perspective on power and performance and represent results using the hardware and software detailed above. This measured result may not be representative of the result that would be measured were this benchmark run at the availability date with contemporary software."

For "historic results", it is expected that the benchmark will be run with software of similar vintage to that of the hardware where possible (the benchmark may require a later JVM).

2.11 System Under Test (SUT)

The SPECpower_ssj2008 benchmark is intended to measure the power and performance attributes of a computer server under the specific benchmark workload. In the sense of this benchmark the server or system under test (SUT) is defined by the following properties:

  • It is a single node platform or one machine only, e.g. clusters and blade servers are not allowed.
  • The server runs a single OS image.
  • One or more instances of a single Java application are installed.
  • A Java run time environment and a network connection must be supported.

2.11.1 Electrical Equivalence

Many other SPEC benchmarks allow duplicate submissions for a single system sold under various names. Each SPECpower_ssj2008 result submitted to SPEC or made public must be for an actual run of the benchmark on the SUT named in the result. Electrically equivalent submissions are not allowed.

2.11.2 Hardware

Any hardware configuration sufficient to install, start, and run the benchmark to completion in compliance with these run rules (including the availability guidelines in 2.10) shall be considered a compliant configuration. Any device configured at the time the benchmark is started must remain configured for the duration of the benchmark run. Devices which are configured but not needed for the benchmark (e.g. additional on-board NICs) may be disabled prior to the start of the benchmark run. Manual intervention to change the configuration state of components after the benchmark run has begun is not allowed.

External devices required for initial setup or maintenance of the SUT, but not required for normal operation or for running the benchmark (e.g. an external optical drive used for OS installation) may be removed prior to the benchmark being started.

If the model name or product number implies a specific hardware configuration, these specific components cannot be removed from the hardware configuration but may be upgraded. Any upgrades are subject to the support, availability and reporting requirements of this document. For example, if the SUT is available from the vendor only with dual power supplies, both supplies must be configured and measured during the benchmark run. The power supplies may be upgraded if the vendor offers and supports such an upgrade, and the upgrade must be documented in the benchmark disclosure report.

A video monitor, if configured, may be powered by a separate power source and need not be included in the power measurement of the SUT. All other configured devices must receive their power from the measured power source.

The components are required to be:
  • specified using customer-recognizable names,
  • documented and supported by the providing vendor, and
  • of production quality.

Any tuning or deviation from the default installation or configuration of hardware components is allowed by available tools only and must be reported. This includes BIOS settings, power saving options in the system board management, or upgrade of default components. Only modifications that are documented and supported by the vendor(s) are allowed.

2.11.2.1 Network Interfaces

At least one port of the SUT's fastest network interface controller must be connected and operating at its full rated speed or 1 Gb/s.

Automatically reducing network speed and power consumption in response to traffic levels is allowed for network interface controllers with such capabilities, as long as they are also capable of increasing to their full rated speed automatically.

2.11.3 Software

All software required to run the SPECpower_ssj2008 benchmark must be installed on and executed from a stable storage device which is considered part of the SUT. Required software components are

  • A single image operating system including all modules that are installed during the installation process.
  • A Java run time environment including one or more instances of a Java Virtual Machine (JVM).

Optional power management software, when installed, must be reported. The operating system must be in a state sufficient to execute a class of server applications larger than the benchmark alone. The majority of operating system services should remain enabled. Disabling operating system services may subject disclosures to additional scrutiny by the benchmark subcommittee and may cause the result to be found non-compliant. Any changes from the default state of the installed software must be disclosed in sufficient detail to enable the results to be reproduced. Examples of tuning information which must be documented include, but are not limited to:

  • Description of System Tuning (includes any special OS parameters set, changes to standard daemons or services)
  • List of Java parameters and flags used
  • Any special per-JVM tuning for multi-JVM running (e.g. pinning JVMs to specific processors)

These changes must be "generally available", i.e., available, supported and documented. For example, if a special tool is needed to change the OS state, it must be available to users and documented by the vendor. The tester is expected to exercise due diligence regarding the reporting of tuning changes, to ensure that the disclosure correctly records the intended final product.

The software environment on the SUT is intended to be in a state where applications other than the benchmark could be supported. Disabling of operating system services is therefore discouraged but not explicitly prohibited. Disabled services must be disclosed.

The submitter/sponsor will be responsible for justifying the disabling of service(s).

Services that must not be disabled include but are not limited to logging services such as cron or event logger.

A list of active operating system services may be required to be provided for SPEC's results review. The submitter is required to generate and keep this list for the duration of the review period. Such a list may be obtained, for example, by:

  • Windows: net start
  • Solaris 10: svcs -a
  • Red Hat Linux: /sbin/runlevel; /sbin/chkconfig --list

2.12 Java Specifications

Tested systems must provide an environment suitable for running typical server-side J2SE 5.0 (or higher) applications. Any tested system must include an implementation of the Java (tm) Virtual Machine as described by the following references, or as amended by SPEC for later Java versions:

  • Java Virtual Machine Specification (second edition, ISBN-13: 978-0201432947)
The following are specifically allowed, within the bounds of the Java Platform:
  • Precompilation and on-disk storage of compiled executables are specifically allowed. However, support for dynamic loading is required. Additional rules are defined in section 2.12.2. See section 2.6 for details about allowable flags for compilation.

The system must include a complete implementation of those classes that are referenced by this benchmark as in the J2SE 5.0 specification (http://java.sun.com/j2se/1.5.0/). SPEC does not intend to check for implementation of APIs not used in the benchmark. For example, the benchmark does not use AWT (Abstract Window Toolkit, http://java.sun.com/j2se/1.5.0/docs/guide/awt/index.html), and SPEC does not intend to check for implementation of AWT. Note that the reporter does use AWT; however, it is not necessary to run the reporter on the SUT.

2.12.1 Feedback Optimization and Precompilation

Feedback directed optimization and precompilation from the Java bytecodes are allowed, subject to the restrictions regarding benchmark-specific optimizations in section 1.1.2. Precompilation and feedback-optimization before the measured invocation of the benchmark are also allowed. Such optimizations must be fully disclosed.

2.12.2 Benchmark Binaries and Recompilation

The SPECpower_ssj2008 benchmark binaries are provided in jar files containing the Java classes. Valid runs must use the provided jar files and these files must not be updated or modified in any way. While the source code of the benchmark is provided for reference, the benchmarker must not recompile any of the provided .java files. Any runs that use recompiled class files are marked not valid and cannot be reported or published.

2.13 Power and Temperature Measurement

The SPECpower_ssj2008 benchmark tool set provides the ability to automatically gather measurement data from supported power analyzers and temperature sensors and integrate that data into the benchmark result. SPEC requires that the analyzers and sensors used in a submission be supported by the measurement framework, and be compliant with the specifications in the following sections. SPEC also encourages licensees to submit code supporting additional devices that meet the benchmark requirements, which can be incorporated into later releases.

2.13.1 Power Analyzer Setup

The power analyzer must be located between the AC Line Voltage Source and the SUT. No other active components are allowed between the AC Line Voltage Source and the SUT.

Power analyzer configuration settings that are set by PTDaemon must not be manually overridden.

2.13.2 Power Analyzer Specifications

To ensure comparability and repeatability of power measurements, SPEC requires the following attributes for the power measurement device used during the benchmark. Please note that a power analyzer may meet the requirements when used in some power ranges but not in others, due to the dynamic nature of power analyzer Accuracy and Crest Factor.

  • Measurements - the analyzer must report true RMS power (watts), voltage, amperes and power factor.
  • Accuracy - Measurements must be reported by the analyzer with an overall accuracy of 1% or better for the ranges measured during the benchmark run. Overall accuracy means the sum of all specified analyzer uncertainties for the ranges and frequency being measured.
  • Calibration - the analyzer must be able to be calibrated by a standard traceable to NIST (U.S.A.) (http://nist.gov) or a counterpart national metrology institute in other countries. The analyzer must have been calibrated within the past year.
  • Crest Factor - The analyzer must provide a current crest factor of at least 3. For analyzers that do not specify the crest factor, the analyzer must be capable of measuring an amperage spike of at least 3 times the maximum amperage measured during any 1-second sample of the benchmark test.
  • Logging - The analyzer must have an interface that allows its measurements to be read by the PTDaemon. The reading rate supported by the analyzer must be at least 1 set of measurements per second, where set is defined as watts and at least 2 of the following readings: volts, amps and power factor. The data averaging interval of the analyzer must be either 1 (preferred) or 2 times the reading interval. "Data averaging interval" is defined as the time period over which all samples captured by the high-speed sampling electronics of the analyzer are averaged to provide the measurement set.
For example, overall accuracy is calculated as follows:

An analyzer with a vendor-specified accuracy of +/- 0.5% of reading +/- 4 digits, used in a test with a maximum wattage value of 200W, would have "overall" accuracy of (((0.5%*200W)+0.4W)=1.4W/200W) or 0.7% at 200W.

An analyzer with a wattage range 20-400W, with a vendor-specified accuracy of +/- 0.25% of range +/- 4 digits, used in a test with a maximum wattage value of 200W, would have "overall" accuracy of (((0.25%*400W)+0.4W)=1.4W/200W) or 0.7% at 200W.
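
The arithmetic in the two accuracy examples above can be sketched in Python as follows. This is only an illustration: the function name `overall_accuracy_pct`, its parameters, and the assumption that one display digit equals 0.1 W (which is what makes "4 digits" come out to 0.4W in the examples) are hypothetical and not part of the SPEC tools.

```python
def overall_accuracy_pct(max_watts, pct_error, pct_base_watts, digits, digit_watts=0.1):
    """Return the overall uncertainty as a percentage of the maximum measured wattage.

    pct_error      -- vendor percentage term, e.g. 0.5 for +/- 0.5%
    pct_base_watts -- value the percentage applies to: the reading itself for
                      "% of reading" specs, the full-scale range for "% of range" specs
    digits         -- vendor "+/- N digits" term
    digit_watts    -- assumed size of one display digit in watts (hypothetical: 0.1 W)
    """
    uncertainty_watts = pct_error / 100.0 * pct_base_watts + digits * digit_watts
    return uncertainty_watts / max_watts * 100.0


# First example: +/- 0.5% of reading +/- 4 digits, maximum of 200 W.
of_reading = overall_accuracy_pct(200.0, 0.5, 200.0, 4)
# Second example: +/- 0.25% of the 400 W range +/- 4 digits, maximum of 200 W.
of_range = overall_accuracy_pct(200.0, 0.25, 400.0, 4)

for label, value in (("of reading", of_reading), ("of range", of_range)):
    verdict = "within" if value <= 1.0 else "outside"
    print(f"{label}: {value:.2f}% ({verdict} the 1% requirement)")
```

Both cases evaluate to roughly 0.7%, matching the figures in the text. Note how the "% of range" specification is charged against the full 400 W range even though only 200 W is being measured.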

2.13.3 Temperature Sensor Specifications

Temperature must be measured no more than 50mm in front of (upwind of) the main airflow inlet of the SUT. To ensure comparability and repeatability of temperature measurements, SPEC requires the following attributes for the temperature measurement device used during the benchmark:

  • Logging - The sensor must have an interface that allows its measurements to be read by the benchmark harness. The reading rate supported by the sensor must be at least 4 samples per minute.
  • Accuracy - Measurements must be reported by the sensor with an overall accuracy of +/- 0.5 degrees Celsius or better for the ranges measured during the benchmark run.

2.13.4 Supported and Compliant Devices

See Device List (http://www.spec.org/power_ssj2008/) for a list of currently supported (by the benchmark software) and compliant (in specifications) power analyzers and temperature sensors.

2.13.5 Acceptance Process for New Measurement Devices

Adding a new measurement device to the SPEC power measurement framework includes three components:

  • Providing documentation that the device meets the requirements of sections 2.13.2 and 2.13.3.
  • Adding a new source code module to SPEC's Power and Temperature Daemon to allow the benchmark software to control the device.
  • Performing tests with SPECpower_ssj2008 and SPEC tools to evaluate the actual behavior of the device.

Documentation to prove compliance with all required attributes must be provided. Publicly available documentation is preferred, but in special cases where a device vendor does not wish to disclose information perceived as proprietary, the device vendor may request its documentation remain SPEC Confidential.

For new device modules, all source code submitted to SPEC must include a signed SPEC Permission to Use Form (http://www.spec.org/spec/docs/permission_to_use.pdf) and must be freely available for use by other members and licensees of the benchmark. Supporting documentation must be provided as needed for the review. Once the code has been submitted, SPEC will review it. Barring any issues, SPEC will then incorporate the device module into a new version of the benchmark. Compliant runs must be done with SPEC-provided binaries only.

The final step is testing of the device to verify that it meets the run rules requirements of section 2.13. The intent of this testing is to ensure that results obtained with the device are comparable to results obtained with other measurement devices.

SPEC provides a series of tests (see SPEC's Power Analyzer Acceptance Testing) that must be performed to determine power analyzer behavior under dynamic benchmark conditions. The preferred method of running these tests is to connect the new measurement device in series with another power analyzer that has already been accepted as compliant with the run rules requirements. These tests should be run by the submitter. In cases where the submitter does not have a currently-accepted power analyzer, a member of SPEC may volunteer to run those tests if a device is provided to them.

SPEC will review the test results against a set of specified criteria (see SPEC's Power Analyzer Acceptance Testing). If questions arise, SPEC may ask that additional testing be performed. Once a set of satisfactory results is produced, the device will be accepted as compliant and incorporated into the next release of the benchmark software.

Note: Since only SPEC-provided binaries may be used for compliant results, it is recommended that the device acceptance process be started well in advance of any benchmark use of a new device.

3 Reporting Rules

In order to publicly disclose SPECpower_ssj2008 results, the tester must adhere to these reporting rules in addition to having followed the run rules above. The goal of the reporting rules is to ensure the system under test is sufficiently documented so that someone could reproduce the test and its results and to ensure that the tester has complied with the run rules.

3.1 Reporting Metric and Result

SPECpower_ssj2008 expresses power and performance in terms of "overall ssj_ops/watt". Overall ssj_ops/watt represents the sum of the performance measured at each target load level (in ssj_ops) divided by the sum of the average power (in W) at each target load including active idle.
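
As a minimal sketch of this definition (not the reporter's actual implementation), the metric can be computed as below. The function name and the measurement values are hypothetical; a real run has more target load levels than the three used here for brevity.

```python
def overall_ssj_ops_per_watt(load_levels, active_idle_watts):
    """Compute overall ssj_ops/watt from per-load measurements.

    load_levels       -- list of (ssj_ops, average_watts) pairs, one per target load
    active_idle_watts -- average power at active idle, which contributes power
                         to the denominator but no ssj_ops to the numerator
    """
    total_ops = sum(ops for ops, _ in load_levels)
    total_watts = sum(watts for _, watts in load_levels) + active_idle_watts
    return total_ops / total_watts


# Hypothetical measurements, for illustration only.
levels = [(100_000, 200.0), (50_000, 150.0), (10_000, 110.0)]
score = overall_ssj_ops_per_watt(levels, active_idle_watts=90.0)
print(f"{score:.1f} overall ssj_ops/watt")
```

Because active idle adds power without adding throughput, a system with high idle power is penalized even if its efficiency at the 100% target load is good.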

The report of results is an HTML file generated by the tools provided by SPEC. These tools must not be changed, except for portability reasons with prior SPEC approval. The tools perform error checking and will flag some error conditions as resulting in an "invalid run". However, these automatic checks are only there for debugging convenience, and do not relieve the benchmarker of the responsibility to check the results and follow the run and reporting rules.

The section of the ssj.wxyz.raw file that contains actual test measurement must not be altered. Corrections to the SUT descriptions may be made as needed to produce a properly documented disclosure.

3.1.1 Publication

Any benchmark result not in full compliance with the run and reporting rules must not be represented using the SPECpower_ssj2008 metrics.

SPEC requires that each licensee test location (city, state/province and country) submit a single compliant result for review, and have that result accepted, before publicly disclosing or representing as valid any compliant SPECpower_ssj2008 result. Once the subcommittee has accepted a compliant result, the licensee may publicly disclose any SPECpower_ssj2008 result produced in compliance with these run and reporting rules without further acceptance by the SPECpower subcommittee. The intent of this requirement is that the licensee demonstrates the ability to produce a compliant result before publicly disclosing additional results without review by the subcommittee.

3.1.1.1 Disclosure Requirement

If a SPECpower_ssj2008 licensee publicly discloses a SPECpower_ssj2008 result (for example in a press release, academic paper, magazine article, or public web site), any SPEC member may request that the FDR(s) from the run(s) be sent to SPEC. Such results must be made available to all interested members no later than 10 working days after the request.

Any SPEC member may request that the result and its FDR be reviewed by SPEC. If the tester does not wish to have the result posted on the SPEC web pages, the result will not be posted.

When public claims are made about SPECpower_ssj2008 results, whether by vendors or by academic researchers, SPEC reserves the right to take actions, for example if it should occur that the FDR is not made available, or shows substantially different performance from the tester's claim, or shows obvious violations of the run rules.

3.1.2 Estimates

Estimated results are not allowed to be publicly disclosed.

3.1.3 Comparison to Other Benchmark Suites

Power and performance results from SPECpower_ssj2008 must not be compared to the results from any other benchmark.

3.1.4 Addendum to OSG Fair Use Policy

Any entity choosing to make public statements using SPECpower_ssj2008 must follow the OSG Fair Use guidelines (http://www.spec.org/osg/fair_use-policy.html) and the following addendum to this policy:

For comparisons,
  1. If any measured data from the disclosure is used, the primary metric for the systems being compared must be disclosed in close proximity.
  2. When comparing measured performance and/or power data from any target load level, both the performance and the power results for that target load must also be used in close proximity.
  3. When comparing performance and/or power measurements at different target load levels, the comparisons must also include the performance and power at the 100% target load level in close proximity.

Since the Active Idle measurement point does not have a performance load level, comparisons of Active Idle points must include the primary metric and the performance at the 100% target load level in close proximity. The calibration throughputs must not be used (use the 100% target load level instead). "Close proximity" as used above is defined to mean in the same paragraph, in the same font style and size, and either within 100 words or on the same presentation slide. The following paragraphs are examples of acceptable language when publicly using SPECpower_ssj2008 for comparisons for the three cases listed above, including a 4th example for Active Idle comparisons:

  1. Server X used 350W@100% target load compared to Server Y's 470W@100% target load. Server X has a SPECpower_ssj™2008 result of 247 overall ssj_ops/Watt and Server Y has a result of 415 overall ssj_ops/Watt.
  2. Server X, with a SPECpower_ssj™2008 result of 247 overall ssj_ops/Watt, uses only 185W@20% with 25,000 ssj_ops @ 20% target load, compared to Server Y, with a SPECpower_ssj™2008 result of 212 overall ssj_ops/Watt, which uses 213W@20% with 44,000 ssj_ops @ 20% target load.
  3. Server X with a SPECpower_ssj™2008 result of 247 overall ssj_ops/Watt and (125,000 ssj_ops and 200W)@100% target load achieved (25,000 ssj_ops and 185W)@20%, compared to Server Y with a SPECpower_ssj™2008 result of 247 overall ssj_ops/Watt and (125,000 ssj_ops and 200W)@100% target load, which achieved (25,000 ssj_ops and 187W)@10%.
  4. Server R uses only 50W at the Active Idle point, compared to 255W at Active Idle for Server Q. Server R has a SPECpower_ssj™2008 result of 512 overall ssj_ops/Watt and (185,000 ssj_ops and 200W)@100% target load. Server Q has a SPECpower_ssj™2008 result of 630 overall ssj_ops/Watt and (240,000 ssj_ops and 200W)@100% target load.

Note that, for each of these examples, the following is also required, but could be included in a footnote: "SPEC and the benchmark name SPECpower_ssj are trademarks of the Standard Performance Evaluation Corporation. Benchmark results stated above reflect results published on http://www.spec.org as of November 30, 2007. For the latest SPECpower_ssj2008 benchmark results, visit http://www.spec.org/power_ssj2008."

3.2 Reproducibility

SPEC is aware that power or performance results for pre-production systems may sometimes be subject to change, for example when a last-minute bugfix reduces the final performance.

If the sponsor becomes aware that the SPECpower_ssj2008 metric of a typical released system is more than 5% lower than that reported for the pre-release system, the tester is required to submit a new result for the production system, and the original result shall be marked non-compliant (NC).
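
The 5% threshold above can be illustrated with a short sketch. The function name and the sample metric values are hypothetical, and the sketch assumes the drop is measured relative to the value reported for the pre-release system.

```python
def requires_resubmission(prerelease_metric, production_metric, threshold=0.05):
    """Return True if the production metric is more than 5% below the pre-release metric."""
    if prerelease_metric <= 0:
        raise ValueError("pre-release metric must be positive")
    # Relative drop, taking the reported pre-release value as the baseline (assumption).
    drop = (prerelease_metric - production_metric) / prerelease_metric
    return drop > threshold


# Hypothetical overall ssj_ops/watt values, for illustration only.
print(requires_resubmission(1000.0, 940.0))  # 6% lower
print(requires_resubmission(1000.0, 960.0))  # 4% lower
```

In the first call the production system is 6% lower, so a new result would be required; in the second it is 4% lower, so the original result would stand.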

By submitting or publishing a benchmark disclosure (report) to SPEC, the test sponsor implicitly states that the system performance and power measured is representative of such systems. Power consumption is dependent on many factors that may vary over time within a specific vendor model. It can also vary from system to system due to well-known variability in electronic component fabrication processes.

3.3 Testbed Configuration Disclosure

The system configuration information that is required to reproduce published power and performance results must be reported. The principle is that if anything affects power or performance or is required to duplicate the results, it must be described. Any deviations from the standard, default configuration for the SUT must be documented so an independent party would be able to reproduce the result without any further assistance. For most of the following configuration details, there is an entry in the configuration file, and a corresponding entry in the tool-generated HTML result page. If information needs to be included that does not fit into these entries, the Notes sections must be used.

3.3.1 General Availability Dates

The dates of general customer availability must be listed for the major components: hardware, server software, and operating system, by month and year. All the system, hardware and software features are required to be available within three months of the first publication of these results. With multiple components having different availability dates, the latest availability date must be listed.

3.3.2 Test Sponsor

The reporting page must list:
  • Licensee which is reporting the results
  • SPEC license number of that organization
  • Date the test was performed, by month and year

3.3.3 Benchmark Results Summary

The reporter automatically populates the Benchmark Result Summary.

For each Target Load:
  • Performance: Target Load in %
  • Performance: Actual Load in %
  • Performance: Actual Load in ssj_ops
  • Average Power in W
  • Performance to Power Ratio

A graphical representation of these values is also rendered automatically.

3.3.4 SUT

3.3.4.1 System Class - Component Source

The component source, either "Single Supplier" or "Parts Built", must be disclosed.

  • Single Supplier
    • "Single Supplier" is defined as a SUT configuration where all hardware is provided by a single supplier.
  • Parts Built
    • "Parts Built" is defined as a SUT configuration where hardware is provided by multiple suppliers. A "Parts Built" system disclosure must include enough detail to procure and reproduce all aspects of the submission, including performance and power.

3.3.4.2 SUT Hardware

The following SUT hardware components must be reported:
  • Hardware Vendor and Model
  • CPU Name, CPU Characteristics, CPU Frequency (MHz), CPU(s) Enabled, Hardware Threads / Core, CPU(s) Orderable, Primary Cache, Secondary Cache, Tertiary Cache, Other Cache. If a level of cache is shared among processors in a system, that must be stated in the notes section of the disclosure.
  • Memory Amount (GB), # and size of DIMM, Memory Details, Memory configuration if this is an end-user option that may affect the metric, e.g. interleaving and access time.
  • Power Supply Quantity and Rating (W), Power Supply Details
  • Number, type, model, and capacity of Disk Drive, Disk Controller
  • # and type of Network Interface Cards (NICs) Installed, NICs Enabled in Firmware / OS / Connected, Network Speed (Mbit)
  • The connectivity to keyboard, mouse and monitor must be stated (KVM, USB, PS2, or none)
  • Optical Drives
  • Other Hardware, e.g. write caches, or other accelerators

3.3.4.3 SUT Software

The following SUT software components must be reported:
  • All Power Management Options used, other than default
  • The Operating System (OS) name and version
  • Type of Filesystem
  • JVM: Vendor, Version, Command-line Options, Affinity settings, Instances, Initial Heap (MB), Maximum Heap (MB), Address Bits
  • Benchmark Version
  • The location of the JVM director must be reported
  • Other Software, e.g. management components
  • Any other software packages used during the benchmarking process.
  • Other clarifying information as required to reproduce benchmark results; e.g. non-default kernel parameters, must be stated in the notes section of the disclosure.
  • Additionally, the submitter must be prepared to make available a description of the tuning features that were utilized; e.g. kernel parameters and software settings, including the purpose of that tuning feature. Where possible, it must be noted how the values used differ from the default settings for that tuning feature.

3.3.4.4 System Under Test Notes

The System Under Test Notes section is used to document:
  • System tuning parameters other than default.
  • Process tuning parameters other than default.
  • Background load, if any.
  • Critical customer-identifiable firmware or option versions such as network and disk controllers.
  • Definitions of tuning parameters must be included or a pointer supplied to a separate document hosted by SPEC.
  • Part numbers or sufficient information that would allow the end user to order the SUT configuration.
  • Identification of any components used that are supported but that are no longer orderable by ordinary customers.

3.3.5 Controller System

The following properties must be reported:
  • Controller System Vendor and model
  • Controller System CPU description
  • Controller System Total memory amount
  • Controller System OS Type and Version
  • Controller System JVM Vendor and Version
  • CCS Version

3.3.5.1 Power Analyzer and Temperature Sensor

The following properties must be reported:
  • Power Analyzer Vendor and model
    • Power Analyzer Serial number
    • The connectivity to the Power Analyzer
    • Calibration of the Power Analyzer: Institute, Accredited by, Calibration label and date
    • Power analyzer voltage range
    • Power analyzer current range
    • Power analyzer input connections
  • PTDaemon host system, OS
    • PTDaemon Version
  • Temperature Sensor Vendor and model
    • Driver version
    • The connectivity to the Temperature Sensor
    • PTDaemon host system, OS

3.3.6 Disclosure Notes

The Notes section is used to document:
  • Additional important information required to reproduce the results from other reporting sections that require a larger text area.

3.3.7 Electrical and Environmental Data

The reporter automatically populates the following table entries with measured values.

For each Target Load:

  • Average Voltage (V)
  • Average Current (A)
  • Average Power Factor
  • Average Power (W)
  • Minimum Ambient Temperature (°C)

The reporter also automatically populates measured values for the average Power Factor and the Minimum Temperature (°C).

  • The Line Standard must be reported manually (see 2.8).
  • The Elevation must be reported with an accuracy of 50m or better.
  • The Humidity need not be reported.

4 Submission Requirements for SPECpower_ssj2008

When a potentially-compliant run is completed and acceptance by SPEC is desired, the raw results file must be submitted. The required file(s) should be e-mailed to SPEC as an attachment. The committee may request additional benchmark output files from the submitter as well. The submitter should be prepared to participate in discussion during the review cycle and at the subcommittee meeting in which the result is voted on for final acceptance, to answer any questions raised about the result. The submitter is also required to keep the log files for the SUT and Controller from the run for the duration of the review cycle and make them available upon request. Licensees of the benchmark wishing to submit results for acceptance may be required to pay a fee. The complete submission process is documented in Submitting OSG Benchmark Results to SPEC (http://www.spec.org/osg/submitting_results.html).

5 SPECpower_ssj2008 Benchmark Kit Overview

The benchmark kit includes tools for running the benchmark and reporting its results. The workload and CCS components are written in Java; precompiled class files are included with the kit, so no build step is necessary. This software implements various checks for conformance with these run and reporting rules; therefore, the SPEC software must be used.

The kit also includes C code for the PTDaemon and its device modules. Any new device modules will be evaluated by the sub-committee according to the acceptance process (see section 2.13.5). Once the code is accepted by the sub-committee, it will be made available for any licensee to use in their measurements and submissions.

5.1 Documents overview

The benchmark-related documents (Run and Reporting Rules, User Guide, Hardware Setup Guide, Design Document, Methodology, and FAQ) can be found in the doc directory of the distribution. For the latest versions, please consult SPEC's website (http://www.spec.org/power_ssj2008/).


Copyright © 2007-2008 Standard Performance Evaluation Corporation
All Rights Reserved