Standard Performance Evaluation Corporation


SPEC SDM Suite (Retired)


System Development: Multi-tasking

This benchmark suite is designed to measure how a system handles an environment with a large number of users issuing typical software development commands: make, cp, diff, grep, man, mkdir, spell, etc.
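
For a flavor of the workload, one simulated user's script might look something like the following. This is purely a hypothetical sketch, not an actual SDM script; spell(1) in particular may be absent on modern systems.

    # Hypothetical example of one simulated user's work; the real
    # SDM scripts differ but exercise the same class of commands.
    mkdir work && cd work
    cp /usr/include/stdio.h proto.h
    grep -c define proto.h
    diff proto.h /usr/include/stdio.h
    man cp > cp.txt 2>/dev/null
    spell cp.txt > typos 2>/dev/null
    cd .. && rm -rf work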

Results

The SDM results since September '94 are available online.

Description

There are two benchmark tests in this suite: 057.sdet and 061.kenbus1. Both are implemented by driving programs that feed randomly ordered scripts to the shell. Both are characterized by heavy command usage (though individual commands usually have very short execution times), significant file system activity (especially on tmp devices), and large numbers of concurrent processes.

These tests measure how many scripts per hour the system under test can complete. Before the run, one script is generated for each user by combining separate subtasks in a random order. Each user is then given a home directory populated with the appropriate directory tree and files. Once everything is set up, a shell is started for each user and passed that user's script. When the last shell finishes, the timer is stopped and an overall script execution rate is calculated.
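
The driving logic can be sketched in a few lines of shell. The sketch below is hypothetical, not the actual SPEC driver: it assumes subtask fragments in a ./subtasks/ directory, per-user directories named home1..homeN, GNU sort's -R option for shuffling, and bash.

    #!/bin/bash
    # Hypothetical sketch of the driver: build one randomly ordered
    # script per user, run a shell per user in parallel, and report
    # an overall scripts-per-hour rate.
    NUSERS=20
    start=$(date +%s)
    for u in $(seq 1 "$NUSERS"); do
        # shuffle the subtask fragments into this user's script
        ls subtasks/* | sort -R | xargs cat > "home$u/script.sh"
        ( cd "home$u" && sh script.sh >/dev/null 2>&1 ) &
    done
    wait                        # the clock stops with the last user
    end=$(date +%s)
    echo "scripts/hour: $(( NUSERS * 3600 / (end - start) ))"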

057.SDET

057.sdet is SPEC's version of a workload known by various names, including Gaede and TPD. It was developed years ago by Steve Gaede of AT&T as a means of measuring how much work a system could handle and how it behaved when overloaded. The Gaede test is still used today by parts of AT&T in their purchasing decisions.

More information is included in the benchmark description file, which was written by those who submitted the benchmark to SPEC. A list of the commands used is also available.

"Perspectives on the SPEC SDET Benchmark" - A paper by Steven L. Gaede, Lone Eagle Systems, Inc.

061.Kenbus1

061.kenbus1 is SPEC's version of the MUSBUS workload developed by Ken McDonell while he was at Monash University. MUSBUS was intended to be a useful QA and performance tool that would be widely available across many UN*X variants. The version of MUSBUS that SPEC used as its base was version 5.2.

More information is included in the benchmark description file, which was written by those who submitted the benchmark to SPEC. A list of the commands used is also available.

Differences between the tests

The most notable difference between 057.sdet and 061.kenbus1 is that Kenbus1 implements the notion of a typing rate of 3 chars/sec when feeding input to any of the commands; SDET just reads stdin from a file or a shell "here-document". The typing rate has several impacts. Most notably, the number of users required to saturate the system with Kenbus1 is 4-5 times the number needed for SDET. Additionally, the typing rate means processes block and wake up far more frequently.
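
The contrast in input delivery can be illustrated with a small sketch. The first form feeds stdin all at once from a here-document, as SDET does; the second paces the same input at roughly 3 characters per second, as Kenbus1 does. This is only an approximation, and it assumes bash's read -n1 and a sleep that accepts fractional seconds.

    # SDET style: stdin arrives all at once from a here-document
    wc -w <<'EOF'
    edit this file please
    EOF

    # Kenbus1 style (approximation): drip input at ~3 chars/sec,
    # so the reading process blocks and wakes repeatedly
    while IFS= read -r -n1 ch; do
        if [ -z "$ch" ]; then printf '\n'; else printf '%s' "$ch"; fi
        sleep 0.33
    done <<'EOF' | wc -w
    edit this file please
    EOF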

In the other direction, the command set used in SDET is much richer, and many of the commands do far more work than their Kenbus1 counterparts. This makes SDET a more strenuous test of the command and file system subsystems.

Tuning

First of all, both tests require fairly serious configurations. Be sure the kernel tables (swap space, process table [NPROC], file table [NFILES], etc.) are large enough to run several dozen (SDET) or several hundred (Kenbus1) users at several processes per user.
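
As a present-day illustration only: on a modern Linux system the analogous limits can be inspected with commands like the following. Parameter names vary by OS, and on the systems these benchmarks targeted, NPROC and NFILES were compiled-in kernel constants rather than runtime tunables.

    ulimit -u                 # max user processes for this shell
    ulimit -n                 # max open files per process
    sysctl fs.file-max        # system-wide open-file limit
    sysctl kernel.pid_max     # upper bound on process IDs
    swapon --show             # configured swap devices and sizes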

It is important to spread out the IO as much as possible. It is very difficult to tune these benchmarks without sar(1) or some other monitor of CPU, file system, and disk utilization. The following file systems should be on different disks if possible: /, swap, /tmp, /usr/tmp, and $SPEC/. Additionally, the users' home directories should be spread among the other available disks. Each user needs only about 1 MB of disk space in its home directory during a run.
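
Spreading the home directories might be scripted along these lines; the mount points below are assumptions, and the symlinks simply give the driver a flat view of the user directories.

    # Hypothetical sketch (bash): place home directories round-robin
    # across separately mounted disks
    disks=(/disk1 /disk2 /disk3)
    for u in $(seq 1 50); do
        d=${disks[$(( (u - 1) % ${#disks[@]} ))]}
        mkdir -p "$d/home$u"
        ln -s "$d/home$u" "home$u"
    done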

Using the above configuration and a performance monitor, tune the system so that no single disk is over-utilized and all available CPU time is consumed. If any one disk is doing too much IO, it will be hard to keep the CPU busy and the test will take longer to complete; move some of the active directories to another disk. If the CPU is not fully utilized, higher performance is possible: run more users, or, if additional users do not help and idle CPU remains, use the performance monitors to find the bottleneck (usually an overloaded disk).
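
During a run, the two failure modes described above can be watched for with standard monitors; assuming the Linux sysstat package, for example:

    sar -u 5 12        # CPU utilization: look for leftover idle time
    iostat -dx 5 12    # per-disk statistics: a saturated disk shows
                       # a consistently high %util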