Standard Performance Evaluation Corporation


SPEC Benchmarking Joint US/Europe Colloquium
Program Details

System Balance and Application Balance in Cost/Performance Optimization
John McCalpin, AMD

The concept of "balance" is commonly used in discussions of computer systems, typically with the implication that some systems are "well-balanced" for a workload of interest, while other systems are "unbalanced" or "poorly balanced" for it. Surprisingly, very few quantitative discussions of "system balance" and/or "application balance" appear in the literature. Attempts to create objective, quantifiable definitions quickly lead one to realize that there are significant subtleties here that are quite important in understanding the interaction of technologies with customer buying behavior. In this talk, I will combine simple performance models of the SPEC CFP2000 and CFP2006 benchmarks with simple cost models for computer hardware to show that the "optimum" balance for a system is strongly dependent on the specific metrics one is trying to optimize, and that the mathematical equation for optimum balance yields decidedly non-intuitive results.
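As a toy illustration of the kind of analysis the abstract describes (the models and all numbers below are hypothetical, not taken from the talk), one can pair a simple two-term execution-time model with a linear hardware cost model and observe that the balance point maximizing raw performance differs from the one maximizing performance per dollar:

```python
# Hypothetical two-term performance model and linear cost model,
# searched over a grid of compute-rate / memory-bandwidth design points.

def perf(flops_rate, mem_bw, work_flops=1e9, work_bytes=4e8):
    # Execution rate under a simple non-overlapping time model:
    # total time = compute time + memory time.
    return 1.0 / (work_flops / flops_rate + work_bytes / mem_bw)

def cost(flops_rate, mem_bw, base=500.0, per_gflops=100.0, per_gbs=150.0):
    # Invented cost model: fixed base cost plus linear terms.
    return base + per_gflops * flops_rate / 1e9 + per_gbs * mem_bw / 1e9

best_perf, best_value = None, None
for f in range(1, 41):        # candidate compute rates in GFLOP/s
    for b in range(1, 41):    # candidate memory bandwidths in GB/s
        fr, bw = f * 1e9, b * 1e9
        p = perf(fr, bw)
        v = p / cost(fr, bw)  # metric 2: performance per dollar
        if best_perf is None or p > best_perf[0]:
            best_perf = (p, b / f)     # remember the balance ratio (GB/s per GFLOP/s)
        if best_value is None or v > best_value[0]:
            best_value = (v, b / f)

print("balance maximizing raw performance: ", best_perf[1])
print("balance maximizing performance/cost:", best_value[1])
```

With these (invented) coefficients the raw-performance optimum sits at the corner of the design space with a balance ratio of 1.0, while the performance-per-dollar optimum lands at a noticeably lower bandwidth-to-compute ratio, illustrating how the chosen metric moves the "optimum" balance.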

[back to program details]

What's New With SPEC CPU2006: Changes to Benchmarks, Metrics, Run Rules, and Technical Challenge
John Henning, Sun Microsystems

John will cover the topics listed in the title of the talk, including at least two controversial issues, one strangled metaphor, and a free CPU2006 technical gift for the first 40 people who ask for it. (Sit near the front of the room to increase your chances of receiving the gift.)

[back to program details]

The CPU2006 Benchmark Tools
Cloyce Spradling, Sun Microsystems

The benchmarks that make up the SPEC CPU2006 and MPI2007 benchmark suites are set up, run, timed, and scored by a tools harness. These tools have evolved over time from a collection of edit-it-yourself makefiles, shell scripts, and an Excel spreadsheet to the current Perl-based suite. The basic purpose of the tools is to make life easier for the benchmarker: they make it easier to tweak compilation settings, easier to keep track of those settings, and, most importantly, easier to follow the run and reporting rules.

This presentation will give a basic overview of how the tools normally operate to generate benchmark scores. The course of a normal score-generating run is followed from setup to report generation, with a focus on exactly how the benchmark runs are timed. A couple of new features designed to ease the burden of benchmarking will be discussed as well. The presentation will also cover features designed to aid those wishing to use the benchmarks for research purposes, such as how modified benchmark sources may be tested, ways to work around the tools when they get in your way, and how the tools can help with profiling workloads.

[back to program details]

A Journalist's Experiences with CPU2006
Andreas Stiller, c't Magazine (Heise Verlag)

A short introduction to c't and iX, both with more than 12 years of experience with SPEC.

Three main differences in the approach of the users of the benchmarks:

  1. Companies:
    Let their "superhero" shine as best as possible
  2. Academic:
    Look for special aspects, often combined with the use of CPU benchmarks as the (one and only?) foundation of new developments in the processor and computer area (as can be seen in nearly all papers of the ISCA symposia). SPEC2000 is the most widely accepted measure of HPC performance, i.e. kSI2K (kiloSPECint2000) for huge grids like the LHC Grid.
  3. Journalistic (the main topic here):
    Try to make reviews as fair as possible. Keeping up with each and every new processor (c't mainly in the Mobile/Desktop/Workstation space, iX for Workstations & Servers). Using the newest compiler capabilities to explore the potential of new processors (Hyper-Threading of Itanium Montecito, Helper Threads, Auto-Vectorization, Auto-Parallelization ...). Making published SPEC results more transparent.

Some important aspects of our approach

  • Use of "real world compilers" in addition to specialized compilers : i.e Microsoft Compilers for Windows and GNU for Linux. 32 Bit OS are still very important for c't readership
  • Use of current RTLs (e.g. from VS2005), although they are a bit slower than their VS2003 predecessors, which are therefore, logically, preferred by the companies in group 1...
  • Use of standard hardware (no super high speed "overclocked" memory etc)
  • No use of sophisticated libraries! But this rule can't be kept any more with CPU2006 and Windows XP/Server 2003 because of heavy heap problems with some benchmarks. Some very strange results will be presented. The combination of Vista/Server 2008 and the VS2005 RTL fares much better, but some stack problems can occur with SPEC's default configuration files.
  • Fairness sometimes even includes patching code or compilers (e.g. modern Intel compilers exclude the competition from some optimizations). Results of CPU2000 and CPU2006 (not yet published) with patched code will be presented.
  • No use of Peak, only Base! (It was a good idea to drop the four-flag rule & FDO.)

Propositions & wishes for the future

  1. Addressing multithreading (with synchronization, locking, cache coherency protocols). That will be much more important than the current SPECrate. OpenMP and MPI should help.
  2. Addressing Vectorization: SIMD with native vector data types should be added
  3. Avoiding too much OS influence (like heap management in Windows 2K3)
  4. And finally: much shorter runtimes (even Intel's far-future processor generation, Gesher/Sandy Bridge, will probably not have much more than a 4 GHz clock speed ...)

[back to program details]

The HPC Challenge (HPCC) Benchmark Suite: Characterizing a System with Several Specialized Kernels
Rolf Rabenseifner, High Performance Computing Center, Stuttgart

In 2003, DARPA's High Productivity Computing Systems (HPCS) program released the HPCC benchmark suite. It examines the performance of HPC architectures using well-known computational kernels with various memory access and communication patterns. Consequently, HPCC results bound the performance of real applications as a function of memory access and communication characteristics and define performance boundaries of HPC architectures. The suite was intended to augment the TOP500 list, and by now results are publicly available for 6 of the 10 fastest computers in the world.

This talk will introduce the individual benchmarks used to characterize different system resources. The publicly available results are compared, and the balance of systems is compared based on ratios between computational speed, memory bandwidth, and network bandwidth.
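A minimal sketch of the kind of balance comparison described above (the systems and all numbers below are invented for illustration, not actual HPCC results):

```python
# Invented HPCC-style headline numbers for two hypothetical systems:
# HPL compute rate, STREAM memory bandwidth, and aggregate network bandwidth.
systems = {
    "system_a": {"hpl_tflops": 50.0, "stream_tbs": 20.0, "net_tbs": 2.0},
    "system_b": {"hpl_tflops": 80.0, "stream_tbs": 16.0, "net_tbs": 4.0},
}

for name, r in systems.items():
    # Balance ratios: bandwidth delivered per unit of computational speed.
    # TB/s per TFLOP/s reduces to bytes per flop.
    mem_balance = r["stream_tbs"] / r["hpl_tflops"]
    net_balance = r["net_tbs"] / r["hpl_tflops"]
    print(f"{name}: memory {mem_balance:.2f} B/flop, network {net_balance:.3f} B/flop")
```

Here the faster system_b delivers fewer bytes of memory bandwidth per flop than system_a, which is exactly the kind of trade-off such ratio comparisons expose.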

[back to program details]

SPEC MPI2007 - An Application Benchmark for Clusters and HPC Systems
Matthijs Van Waveren, Fujitsu

SPEC plans to release the SPEC MPI2007 benchmark suite at ISC2007. SPEC HPG has developed this benchmark suite and its run rules over the last few years. The purpose of the SPEC MPI2007 benchmark and its run rules is to further the cause of fair and objective benchmarking of high-performance computing systems. The rules help ensure that published results are meaningful, comparable to other results, and reproducible. MPI2007 includes 13 technical computing applications from the fields of Computational Fluid Dynamics, Molecular Dynamics, Electromagnetism, Geophysics, Ray Tracing, and Hydrodynamics. We describe the benchmark suite, and compare it to other benchmark suites.

[back to program details]

The SPECsfs2007 Benchmark – A Preview
Darren Sawyer, Network Appliance

The SPEC SFS subcommittee has been steadily working towards the release of SPECsfs2007, the first major update to the SPEC SFS network fileserving benchmark in nearly 10 years. Using data collected by member companies from systems deployed at customers around the world, the SPECsfs2007 benchmark makes a number of changes to the original SPECsfs97 NFS workload to adapt to the realities of network fileserving today. These include an adjusted NFSv3 operation mix, increased file and transfer sizes, a much larger working set, elimination of NFSv2 and the UDP networking transport, addition of IPv6 support, and improved simulation of client commit patterns based on server responses to write requests, among other minor changes. How the SFS committee used collected data, industry trends, and experiences with the SPECsfs97 benchmark to identify the need for these revisions will be detailed.

In addition to changes in the NFS workload, high customer demand for the introduction of a Windows-based file serving benchmark led the committee to pursue adding a new workload using the CIFS protocol to the SFS benchmark. Adding CIFS proved to be no easy challenge. Unlike NFS, CIFS is a stateful protocol where certain operations are only likely or even possible when following a certain sequence of previous operations. The technique of using a simple, stateless random distribution of operations utilized by the NFS operation generation code would not suffice for CIFS. Thus, a new operation generation technique, based on generating ‘clusters’ of CIFS operations (aka CoCos) using a Hidden Markov Model (HMM) created from the patterns observed in real customer trace data, was developed not only to maintain proper sequences of operations required by the protocol, but also to better mimic the sequences of operations seen by real CIFS fileservers. The talk will describe this technique to some level of detail and will share the data used to generate the model ultimately used by the CIFS operation generation code in SPECsfs2007.
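As a rough illustration of the idea (using a plain Markov chain with invented states and transition probabilities, rather than the HMM trained on customer trace data that the benchmark actually uses), operation clusters that respect protocol ordering can be generated like this:

```python
import random

# Toy transition table for a stateful CIFS-like protocol: a file must be
# opened before it can be read or written, and every cluster ends in CLOSE.
# All states and probabilities here are invented for illustration.
TRANSITIONS = {
    "OPEN":  [("READ", 0.5), ("WRITE", 0.4), ("CLOSE", 0.1)],
    "READ":  [("READ", 0.4), ("WRITE", 0.2), ("CLOSE", 0.4)],
    "WRITE": [("WRITE", 0.3), ("READ", 0.2), ("CLOSE", 0.5)],
}

def generate_cluster(rng):
    """Walk the chain from OPEN until CLOSE, yielding a valid op sequence."""
    ops = ["OPEN"]
    while ops[-1] != "CLOSE":
        states, weights = zip(*TRANSITIONS[ops[-1]])
        ops.append(rng.choices(states, weights=weights)[0])
    return ops

rng = random.Random(42)
for _ in range(3):
    print(generate_cluster(rng))
```

Because each cluster is a walk through the chain, a READ or WRITE can never appear outside an OPEN/CLOSE pair, mimicking the ordering constraints that a stateless per-operation random draw cannot enforce.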

Following the workload discussions, a brief overview will be given of modifications to several non-workload facets of the benchmark that improve portability, usability, and results reporting and disclosure. The talk will conclude with suggestions for future directions of the SFS benchmark.

[back to program details]

Energy Efficiency of Storage Subsystems
Klaus-Dieter Lange, HP

The increasing concern with the energy usage of datacenters has the potential to drastically change how the IT industry evaluates storage subsystems. We quantify the possible energy savings of utilizing modern storage subsystems by identifying inherent energy characteristics of next-generation disk I/O subsystems. We further demonstrate the power and performance impact of a variety of workload patterns.

[back to program details]

CPU Performance/Power Measurements at the Grid Computing Centre Karlsruhe
Manfred Alef, Forschungszentrum Karlsruhe

One of the largest projects in high energy physics is the construction and operation of the Large Hadron Collider (LHC) at the European particle physics laboratory CERN. From 2007 on, the four detectors in a particle accelerator with a diameter of 9 km will produce about 10...15 petabytes per year. The LHC Computing Grid project (LCG) was launched in order to make the data available to several thousand scientists worldwide.

In 2001 the Grid Computing Centre Karlsruhe (GridKa) was founded as the German LCG "tier 1" computing centre. It also provides grid services to other non-LHC experiments. At present, there are compute clusters with about 2500 CPU cores and disk storage of some 1.5 Petabytes installed.

As one of its first tasks, GridKa started to estimate the electric power consumption, and the heat dissipation, of the computing centre. In order to overcome the limitations of the air conditioning system, GridKa has installed water-cooled computer cabinets, the first computing centre worldwide to do so. Furthermore, detailed investigations of the performance-per-watt ratio of recent cluster nodes have been started.

In the presentation I will give a brief explanation of the Grid Computing Centre Karlsruhe, and describe how the performance and power measurements are used to improve procurements and infrastructure issues.

[back to program details]

SPECpower™ - Benchmarking the Energy Efficiency of Servers
Klaus-Dieter Lange, SPEC Power Chair

SPEC is developing the first generation SPEC benchmark for evaluating the energy efficiency of server class computers. The drive to create the SPECpower™ benchmark comes from the recognition that the IT industry, computer manufacturers, and government agencies are increasingly concerned with the energy usage of servers. Proven SPEC server benchmark concepts are utilized in order to provide a means to fairly and consistently report system energy use under various usage levels. Some critical design decisions of the benchmark suite will be covered.

[back to program details]

Future of SPECjvm (JVM2007)
Stefan Sarne, BEA

SPECjvm98 remains a valuable benchmark in many ways almost 10 years after its release, but is not generating new submissions. The Java subcommittee decided to try to address the obsolescence of this benchmark, at least for submission purposes, by upgrading it. We had several goals in place as we began this effort. This talk begins by explaining those goals and the reasons behind them, and then proceeds to the steps taken to address them. A key concern was making sure we could produce a benchmark that could exploit all the cores on a multi-core system; SPECjvm98 essentially ran single-threaded. We also explore some of the ways in which the benchmark development process grew, describe some of the candidate sub-benchmarks, and discuss some of the strengths and limitations of the suite.

[back to program details]

SPEC Enterprise Java Benchmarks: State of the Art and Future Directions
Sam Kounev, Technical University of Darmstadt/University of Cambridge

Enterprise Java benchmarks such as SPECjAppServer2002 and its successor SPECjAppServer2004 have received increasing attention over the past several years. This talk will discuss the latest developments in the field and will look at two new benchmarks currently being developed by the Java Subcommittee. The first is SPECjms2007, which will be the world's first industry-standard benchmark for enterprise messaging platforms. The second will become the successor of SPECjAppServer2004 for measuring the performance and scalability of Java EE platforms. The talk will present the current state of the two efforts and discuss some of the future work that has been planned.

[back to program details]