Run and Reporting Rules for SPEC OMP2001

                       SPEC High Performance Group

                          Approved 3 August 2011

                           Effective 3 August 2011

                               Please see 
                http://www.spec.org/omp/docs/runrules.html 
                            for latest rules.






                                ABSTRACT
                                
        This document provides guidelines required for building,
             running, and reporting SPEC OMP2001 benchmarks.


                


Table of Contents
      Purpose
  1.  General Philosophy
  2.0 Building SPEC OMP2001
        2.0.1 Peak and base builds
        2.0.2 Runspec must be used
        2.0.3 The runspec build environment 
        2.0.4 Continuous Build requirement
        2.0.5 Changes to the runspec build environment 
        2.0.6 Cross-compilation allowed 
        2.0.7 Individual builds allowed
        2.0.8 Tester's assertion of equivalence between build types 
    2.1 General Rules for Optimizations         
        2.1.1 Limitations on library substitutions
        2.1.2 Feedback directed optimization is allowed
        2.1.3 Limitations on size changes
    2.2 Base Optimization Rules
        2.2.1 Safe
        2.2.2 Same for all 
        2.2.3 Feedback directed optimization is allowed in base
        2.2.4 Assertion flags may NOT be used in base
        2.2.5 Floating point reordering allowed
        2.2.6 Portability flags
        2.2.7 Cannot use names
        2.2.7.1 Exceptions
    2.3 Peak Optimizations and Permitted Source Code Changes
  3.  Running SPEC OMP2001
    3.1 System Configuration
        3.1.1 File Systems 
        3.1.2 System State 
    3.2 Continuous Run Requirement
    3.3 Run-time environment
        3.3.1 General run-time environment rules
        3.3.2 Run-time environment modifications during peak runs
    3.4 Basepeak
  4.  Results Disclosure
    4.1 Rules regarding availability date and systems not yet shipped
    4.2 Configuration Disclosure
        4.2.1 System Identification
        4.2.2 Hardware Configuration
        4.2.3 Software Configuration
        4.2.4 Tuning Information
    4.3 Test Results Disclosure
        4.3.1 Metrics
    4.4 Metric Selection
    4.5 Research and Academic usage of OMP2001
    4.6 Fair Use                                
  5.  Run Rule Exceptions 
  



Purpose

  This document specifies how the benchmarks in the OMP2001 suites are to
  be run for measuring and publicly reporting performance results, to
  ensure that results generated with the suites are meaningful, comparable
  to other generated results, and reproducible (with documentation
  covering factors pertinent to reproducing the results). 

  Per the SPEC license agreement, all results publicly disclosed must
  adhere to the SPEC Run and Reporting Rules, or be clearly marked as
  estimates.

  The following basics are expected and clarified in the main body of
  the document: 

  - Adherence to the SPEC general run rule philosophy, including:

        + general availability of all components within 3 months of  
          publication.

        + providing a suitable environment for C/Fortran programs.

  - Use of the SPEC tools for all published results, including:

        + compilation of the benchmark with the SPEC tools.

        + requiring the worse result of two runs, or the median of
          three or more runs of each benchmark, to help promote
          stability and reproducibility.

        + requiring that a publishable result be generated with one
          invocation of the SPEC tools.

        + validating the benchmark output with the SPEC-provided 
          validation output to ensure that the benchmark ran to 
          completion and generated correct results. 

  - Adherence to the criteria for flag selection, including:

        + proper use of feedback directed optimization for both base and 
          peak measurements.

  - Availability of a full disclosure report

  - Clear distinction between measurements and estimates

  Each of these points is discussed in further detail below. 
 
  Suggestions for improving this run methodology should be made to the
  SPEC High Performance Group (HPG) for consideration in future releases. 





1. General Philosophy
  
  SPEC believes the user community will benefit from an objective series
  of tests which can serve as common reference and be considered as part
  of an evaluation process.  

  SPEC OMP2001 provides benchmarks in the form of source code, which are
  compiled according to the rules contained in this document.   It is
  expected that a tester can obtain a copy of the suites, install the
  hardware, compilers, and other software described in another tester's
  result disclosure, and reproduce the claimed performance (within a
  small range to allow for run-to-run variation).
 
  Two benchmark suites are provided: OMPM2001 with a medium working
  set size (larger than SPEC CPU2000), and OMPL2001 with an even 
  larger working set size.  Both suites use OpenMP directives.

  SPEC is aware of the importance of optimizations in producing the best
  system performance.  SPEC is also aware that it is sometimes hard to
  draw an exact line between legitimate optimizations that happen to
  benefit SPEC benchmarks and optimizations that specifically target the
  SPEC benchmarks.  However, with the list below, SPEC wants to increase
  awareness of implementers and end users to issues of unwanted
  benchmark-specific optimizations that would be incompatible with
  SPEC's goal of fair benchmarking. 
  
  To ensure that results are relevant to end-users, SPEC expects that
  the hardware and software implementations used for running the
  SPEC benchmarks adhere to the following conventions: 
  
  - Hardware and software used to run the OMPM2001/OMPL2001 benchmarks
    must provide a suitable environment for running typical C
    and Fortran programs. 

  - Optimizations must generate correct code for a class of programs,
    where the class of programs must be larger than a single SPEC
    benchmark or SPEC benchmark suite.  This also applies to assertion
    flags that may be used for peak compilation measurements (2.2.4).
  
  - Optimizations must improve performance for a class of programs
    where the class of programs must be larger than a single SPEC
    benchmark or SPEC benchmark suite. 

  - The vendor encourages the implementation for general use.

  - The implementation is generally available, documented and supported 
    by the providing vendor. 

  In cases where it appears that the above guidelines have not been
  followed, SPEC may investigate such a claim and request that the
  offending optimization (e.g. a SPEC-benchmark specific pattern
  matching) be backed off and the results resubmitted.  Or, SPEC may
  request that the deficiency be corrected (e.g. make the
  optimization more general purpose or correct problems with code
  generation) before submitting results based on the optimization. 
 
  The SPEC High Performance Group reserves the right to adapt the OMPM2001
  and OMPL2001 suites as it deems necessary to preserve its goal of fair
  benchmarking (e.g. remove a benchmark, modify benchmark code or
  workload, etc).  If a change is made to a suite, SPEC will notify the
  appropriate parties (i.e. members and licensees).  SPEC may
  rename the metrics (e.g. changing the metric from SPECompM2001 to
  SPECompM2001a).  In the case that a benchmark is removed, SPEC
  reserves the right to republish in summary form adapted results for
  previously published systems, converted to the new metric.  In the case
  of other changes, such a republication may necessitate re-testing and may
  require support from the original test sponsor.   

  SPEC OMP2001 metrics may be estimated.  All estimates must be clearly
  identified as such.  Licensees are encouraged to give a rationale or
  methodology for any estimates, and to publish actual SPEC OMP2001
  metrics as soon as possible.  SPEC requires that every use of an
  estimated number be flagged, rather than burying an asterisk at the
  bottom of a page.  For example, say something like this:

      The JumboFast will achieve estimated performance of 
         Model 1   SPECompMpeak2001 50 est.
                   SPECompLpeak2001 60 est.
         Model 2   SPECompMpeak2001 70 est.
                   SPECompLpeak2001 80 est.

  The use of SPEC OMP2001 metrics is permitted only after submission 
  to SPEC, successful review, and publication.  All other use of 
  SPEC OMP2001 metrics must be clearly identified as estimated 
  or under review.  Submitted results that have not yet been approved
  are labeled as being under review.



2.0 Building SPEC OMP2001

  SPEC has adopted a set of rules defining how the SPEC OMP2001 benchmark
  suites must be built and run to produce peak and base metrics. 

  2.0.1 Peak and base builds

  "Peak" metrics are produced by building each benchmark in the suite
  with a set of optimizations individually tailored for that benchmark. 
  The optimizations selected must adhere to the set of general benchmark
  optimization rules described in section 2.1 below.  Limited source code
  modifications related to parallel performance are allowed (see section 2.3).
  
  "Base" metrics are produced by building all the benchmarks in the
  suite with a common set of optimizations and without any modifications
  to the source or directives.  In addition to the general
  benchmark optimization rules (section 2.1), base optimizations must
  adhere to a stricter set of rules described in section 2.2.   These
  additional rules serve to form a "baseline" of recommended performance
  optimizations for a given system. 
  
  2.0.2 Runspec must be used

  With the release of the SPEC OMP2001 suites, a set of tools based on GNU
  Make and Perl5 is supplied to build and run the benchmarks.  To
  produce publishable results, these SPEC tools must be used. 
  This helps ensure reproducibility of results by requiring that all
  individual benchmarks in the suite are run in the same way and that a
  configuration file that defines the optimizations used is available. 

  The primary tool is called "runspec" (runspec.bat for Windows NT). It
  is described in the file runspec.html in the docs subdirectory of the
  SPEC root directory -- in Bourne shell notation, that would be
  ${SPEC}/docs/runspec.html .
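
  For example, a complete run of a suite, using a hypothetical
  configuration file named "myconfig", could be started with:

        runspec -c myconfig -a validate all

  The command line options are described in runspec.html; the
  configuration file syntax is described in config.html.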

  SPEC supplies pre-compiled versions of the tools for a variety of
  platforms.  If a new platform is used, please see 
  ${SPEC}/docs/tools_build.txt for information on how to build the tools
  and how to obtain approval for them. 

  For more complex forms of compilation, for example feedback-directed 
  compilation, SPEC has provided hooks in the tools so that such 
  compilation and execution are possible (see the tools documentation,
  config.html, for details).  Only if, unexpectedly, such compilation and 
  execution are not possible may the test sponsor ask SPEC for permission
  to use performance-neutral alternatives (see section 5).

  2.0.3 The runspec build environment 
   
  When runspec is used to build the SPEC OMP2001 benchmarks, it must be
  used in generally available, documented, and supported environments
  (see section 1), and any aspects of the environment that contribute to
  performance must be disclosed to SPEC (see section 4).  
  
  On occasion, it may be possible to improve run time performance by
  environmental choices at build time.  For example, one might install
  a performance monitor, turn on an operating system feature such as
  bigpages, or set an environment variable that causes the cc driver to
  invoke a faster version of the linker.
  
  It is difficult to draw a precise line between environment settings 
  that are reasonable versus settings that are not.  Some settings are 
  obviously not relevant to performance (such as hostname), and SPEC 
  makes no attempt to regulate such settings.  But for settings that do
  have a performance effect, for the sake of clarity, SPEC has chosen 
  that: 
  
  (a) It is acceptable to install whatever software the tester wishes,
      including performance-enhancing software, provided that the 
      software is installed prior to starting the builds, remains
      installed throughout the builds, is documented, supported, 
      generally available, and disclosed to SPEC.
  
  (b) It is acceptable to set whatever system configuration parameters 
      the tester wishes, provided that these are applied at boot time, 
      documented, supported, generally available, and disclosed to 
      SPEC. "Dynamic" system parameters (i.e. ones that do not require a
      reboot) must nevertheless be applied at boot time, except as
      provided under section 2.0.5.
 
  (c) After the boot process is completed, environment settings may be 
      made as follows:
  
       * to specify resource limits, as in the Bourne shell "ulimit" 
         command, and
  
       * to select major components of the compilation system, as in: 
               setenv CC_LOC /net/dist/version73/cc
               setenv LD_LOC /net/opt/dist/ld-fast
  
      provided that these settings are documented; supported; generally 
      available; disclosed to SPEC; made PRIOR to starting the build;
      and do not change during the build, except as provided in section
      2.0.5.
  
  2.0.4 Continuous Build requirement

  As described in section 1, it is expected that testers can reproduce
  other testers' results.  In particular, it must be possible for a new
  tester to compile both the base and peak benchmarks for an entire
  suite (i.e. OMPM2001 or OMPL2001) in one execution of runspec, with
  appropriate command line arguments and an appropriate configuration
  file, and obtain executable binaries that are (from a performance
  point of view) equivalent to the binaries used by the original tester.

  The simplest and least error-prone way to meet this requirement is for
  the original tester to take production hardware, production software,
  a SPEC config file, and the SPEC tools and actually build the
  benchmarks in a single invocation of runspec on the System Under Test
  (SUT).  But SPEC realizes that there is a cost to benchmarking and
  would like to address this, for example through the rules that follow
  regarding cross-compilation and individual builds.  However, in all
  cases, the tester is taken to assert that the compiled executables
  will exhibit the same performance as if they all had been compiled
  with a single invocation of runspec (see 2.0.8).

  2.0.5 Changes to the runspec build environment 
 
  SPEC OMP2001 base binaries must be built using the environment rules
  of section 2.0.3, and may not rely upon any changes to the environment
  during the build.  

  Note 1: base cross compiles using multiple hosts are allowed (2.0.6), 
  but the performance of the resulting binaries is not allowed to
  depend upon environmental differences among the hosts.  It must be
  possible to build performance-equivalent base binaries with one set
  of switches (2.2.2), in one execution of runspec (2.0.4), on one
  host, with one environment (2.0.3).
  
  For a peak build, the environment may be changed, subject to the 
  following constraints:
 
     - The environment change must be accomplished using the SPEC-
       provided config file hooks (such as fdo_pre0).
 
     - The environment change must be fully disclosed to SPEC (see
       section 4).
 
     - The environment change must not be incompatible with a Continuous 
       Build (see section 2.0.4).
 
     - The environment change must be accomplished using simple shell
       commands (such as "/usr/opt/performance_monitor -start" or  
       "setenv BIGPAGES YES").  It is not permitted to invoke a more 
       complex entity (such as a shell script, batch file, kdbx script,
       or NT registry adjustment program) unless that entity is
       provided as part of a generally-available software package. For
       example, a script could use kdbx to adjust the setting of
       bigpages if that script were provided as a part of the OS, but
       the tester could not write his or her own scripts to hack the
       kernel just for SPEC.
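
  As an illustration, such a peak-only environment change might be
  expressed in the config file as follows (a sketch only; the
  performance monitor command is assumed to be part of a generally
  available software package):

       fdo_pre0 = /usr/opt/performance_monitor -start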

  Note 2: peak cross compiles using multiple hosts are allowed (2.0.6), 
  but the performance of the resulting binaries is not allowed to
  depend upon environmental differences among the hosts.  It must be
  possible to build performance-equivalent peak binaries with one
  config file, in one execution of runspec (2.0.4), in the same
  execution of runspec that built the base binaries, on one host,
  starting from the environment used for the base build (2.0.3), and
  changing that environment only through config file hooks (2.0.5).
 
  2.0.6 Cross-compilation allowed 

  It is permitted to use cross-compilation, that is, a building process
  where the benchmark executables are built on a system (or systems)
  that differ(s) from the SUT.  The runspec tool must be used on all
  systems (typically with "-a build" on the host(s) and "-a validate"
  on the SUT).
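
  For example, with a hypothetical configuration file named "myconfig",
  the typical division of work would be:

       (on the build host)    runspec -c myconfig -a build all
       (on the SUT)           runspec -c myconfig -a validate all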

  If all systems belong to the same product family and if the software
  used to build the executables is available on all systems, this does
  not need to be documented.  In the case of a true cross compilation,
  (e.g. if the software used to build the benchmark executables is not
  available on the SUT, or the host system provides performance gains
  via specialized tuning or hardware not on the SUT), the host system(s)
  and software used for the benchmark building process must be
  documented in the Notes section.  See section 4.

  It is permitted to use more than one host in a cross-compilation. If
  more than one host is used in a cross-compilation, they must be
  sufficiently equivalent so as not to violate rule 2.0.4.  That is, it
  must be possible to build the entire suite on a single host and obtain
  binaries that are equivalent to the binaries produced using multiple
  hosts.  

  The purpose of allowing multiple hosts is so that testers can save 
  time when recompiling many programs.  Multiple hosts may NOT be used 
  in order to gain performance advantages due to environmental 
  differences among the hosts.  In fact, the tester must exercise great 
  care to ensure that any environment differences are performance
  neutral among the hosts, for example by ensuring that each has the
  same version of the operating system, the same performance software,
  the same compilers, and the same libraries.  The tester should
  exercise due diligence to ensure that differences that appear to be
  performance neutral - such as differing MHz or differing memory
  amounts on the build hosts - are in fact truly neutral.

  Multiple hosts may NOT be used in order to work around system or
  compiler incompatibilities (e.g. compiling the SPEC OMPM2001 C 
  benchmarks on a different OS version than the SPEC OMPM2001 Fortran 
  benchmarks in order to meet the different compilers' respective OS 
  requirements), since that would violate the Continuous Build rule
  (2.0.4).

  2.0.7 Individual builds allowed

  It is permitted to build the benchmarks with multiple invocations of 
  runspec, for example during a tuning effort.  But, the executables 
  must be built using a consistent set of software.  If a change to the
  software environment is introduced (for example, installing a new
  version of the C compiler which is expected to improve the 
  performance of one of the medium benchmarks), then all 
  affected benchmarks must be rebuilt (in this example, all the C 
  benchmarks in the medium suite). 

  2.0.8 Tester's assertion of equivalence between build types 

  The previous four sections (2.0.4, 2.0.5, 2.0.6, and 2.0.7) may appear
  to contradict each other, but the key word in 2.0.4 is the word
  "possible".  Consider the following sequence of events:

    - A tester has built a complete set of OMP2001 executable images 
      ("binaries") on her usual host system.  

    - A hot new SUT arrives for a limited period of time.  It has no
      compilers installed.

    - A SPEC OMP2001 tree is installed on the SUT, along with the 
      binaries and config file generated on the usual host.

    - It is learned that performance could be improved if the peak
      version of 999.sluggard were compiled with -O5 instead of -O4.  

    - On the host system, the tester edits the config file to change to
      -O5 for 999.sluggard, and issues the command:

            runspec -c myconfig -D -a build -T peak sluggard

    - The tester copies the new binary and config file to the SUT

    - A complete run is started by issuing the command:

            runspec -c myconfig -a validate all

    - Performance is as expected, and the results are submitted to SPEC
      (including the config file).

  In this example, the tester is taken to be asserting that the above
  sequence of events produces binaries that are, from a performance
  point of view, equivalent to binaries that would have been produced
  in a single invocation of the tools.  If there is some optimization
  that can only be applied to individual benchmark builds and cannot be
  applied in a continuous build, the optimization is not allowed.  

  Rule 2.0.8 is intended to provide some guidance about the kinds of
  practices that are reasonable, but the ultimate responsibility for
  result reproducibility lies with the tester.  If the tester is
  uncertain whether a cross-compile or an individual benchmark build is
  equivalent to a full build on the SUT, then a full build on the SUT
  is required (or, in the case of a true cross-compile which is
  documented as such, then a single "runspec -a build" is required on a
  single host.)  Although full builds add to the cost of benchmarking,
  in some instances a full build in a single runspec may be the only
  way to ensure that results will be reproducible.


2.1 General Rules for Optimizations

  The following rules apply to compiler flag selection for SPEC OMP2001
  Peak and Base Metrics.  Additional rules for Base Metrics follow in
  section 2.2. 

2.1.1  Limitations on library substitutions

         Flags which substitute pre-computed (e.g. library-based)
         routines for routines defined in the benchmark on the basis of
         the routine's name are not allowed.  Exceptions are:

          a) the function "alloca".  It is permitted to use a flag that
             substitutes the system's "builtin_alloca" for any C
             benchmark.

          b) the netlib-interface-compliant level 1, 2 and 3 BLAS 
             functions, LAPACK functions, and FFT functions.  Such
             substitution shall only be acceptable in a peak run, 
             not in base. 

2.1.2  Feedback directed optimization is allowed.  

         Only the training input (which is automatically selected by
         runspec) may be used for the run that generates the feedback
         data.  

         For peak runs, optimization with multiple feedback runs is 
         also allowed.
  
         The requirement to use only the train data set at compile time
         shall not be taken to forbid the use of run-time dynamic
         optimization tools that would observe the reference execution
         and dynamically modify the in-memory copy of the benchmark. 
         However, such tools would not be allowed to in any way affect
         later executions of the same benchmark (for example, when
         running multiple times in order to determine the worst run
         time).   Such tools would also have to be disclosed in the
         submission of a result, and would have to be used for the
         entire suite (see section 3.3).
  
2.1.3 Limitations on size changes

         Flags that change a data type size to a size different from
         the default size of the compilation system are not allowed. 
         Exceptions are: a) C long can be 32 bits or greater, b)
         pointer sizes can be set to a size different from the default. 

2.2 Base Optimization Rules

  In addition to the rules listed in section 2.1 above, the selection of
  optimizations to be used to produce SPEC OMP2001 Base Metrics includes
  the following: 

  2.2.1  Safe

         The optimization options used are expected to be safe, and it
         is expected that system or compiler vendors would endorse the
         general use of these options by customers who seek to achieve
         good application performance. 

         If a compiler optimization eliminates the user-visible effects
         of the conversions from "double" to "float" values required by the
         C standard (6.3.1.5 in ANSI C99) such an optimization is not
         considered safe.

  2.2.2  Same for all 

         The same compiler and the same set of optimization flags or
         options must be used for all benchmarks of a given language
         within a benchmark suite.  All flags must be applied in the
         same order for all benchmarks.  The config.html file covers
         how to set this up with the SPEC tools.

         Specifically, benchmarks that are written in Fortran-90 may
         not use a different set of flags or a different compiler
         invocation in a base run.  (In a peak run, it is permissible
         to use different compiler commands.)
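
         In config file terms, "same for all" amounts to a single set
         of optimization variables applied to every benchmark of a
         language in the base section, for example (a sketch only; the
         flag spellings are illustrative, and the variable names follow
         the conventions of config.html):

                default=base=default=default:
                COPTIMIZE = -fast
                FOPTIMIZE = -fast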

  2.2.3   Feedback directed optimization is allowed in base.  

          The allowed steps are:
 
             PASS1:        compile the program
 
             Training run: run the program with the train data set
 
             PASS2:        re-compile the program, or invoke a tool that
                           otherwise adjusts the program, and which uses
                           the observed profile from the training run.
 
         PASS2 is optional.  For example, it is conceivable that a
         daemon might optimize the image automatically based on the
         training run, without further tester intervention.  Such a
         daemon would have to be noted in the full disclosure to SPEC.
 
         It is acceptable to use the various fdo_* hooks to clean up
         the results of previous feedback compilations.  The preferred
         hook is fdo_pre0 -- for example:
  
               fdo_pre0 = rm /tmp/prof/*Counts*
  
         Other than such cleanup, no intermediate processing steps may
         be performed between the steps listed above.  If additional
         processing steps are required, the optimization is allowed for
         peak only but not for base.
 
         When a two-pass process is used, the flag(s) that explicitly
         control(s) the generation or the use of feedback information
         can be - and usually will be - different in the two 
         compilation passes.  For the other flags, one of the two 
         conditions must hold:
 
          (1) The same set of flags are used for both invocations of
              the compiler/linker.  For example:
 
                PASS1_CFLAGS= -gen_feedback -fast_library -opt1 -opt2 
                PASS2_CFLAGS= -use_feedback -fast_library -opt1 -opt2 
 
          (2) The set of flags in the first invocation are a subset
              of the flags used in the second.  For example:
 
                PASS1_CFLAGS= -gen_feedback -fast_library
                PASS2_CFLAGS= -use_feedback -fast_library -opt1 -opt2 

  2.2.4  Assertion flags may NOT be used in base.  

         An assertion flag is one that supplies semantic information 
         that the compilation system did not derive from the source 
         statements of the benchmark.
        
         With an assertion flag, the programmer asserts to the compiler
         that the program has certain nice properties that allow the
         compiler to apply more aggressive optimization techniques (for
         example, that there is no aliasing via C pointers).  The
         problem is that there can be legal programs (possibly strange,
         but still standard-conforming programs) where such a property
         does not hold.  These programs could crash or give incorrect
         results if an assertion flag is used.  This is the reason why
         such flags are sometimes also called "unsafe flags".  Assertion
         flags should never be applied to a production program without
         previous careful checks; therefore they are disallowed for
         base.  

  2.2.5  Floating point reordering allowed

         Base results may use flags which affect the numerical accuracy
         or sensitivity by reordering floating-point operations based on
         algebraic identities.  In addition, any reordering due to
         parallel calculations finishing in a different order is
         permitted, e.g. reductions can be done in any order if
         done in parallel.
        

  2.2.6 Portability flags 

         Portability flags may be required for some benchmarks.  If a
         flag is applied to a single benchmark, it is considered a
         portability flag if, and only if, one of the following two
         conditions holds:
 
         (a) The flag is necessary for the successful compilation and
             correct execution of the benchmark regardless of any or all
             compilation flags used.  That is, if it is possible to
             build and run the benchmark without this flag, then this
             flag is not considered a portability flag.

         (b) The benchmark is discovered to violate the ANSI standard, 
             and the compilation system needs to be so informed in
             order to avoid incorrect optimizations.  

             For example, if a benchmark fails with
                         -O4
             due to a standard violation, but works with either
                         -O0
             or
                         -O4 -noansi_alias
             then it would be permissible to use -noansi_alias as a 
             portability flag.  

         Proposed portability flags are subject to scrutiny by SPEC
         HPG.  The initial submissions for OMP2001 will
         include a reviewed set of portability flags on several
         operating systems; later submitters who propose to apply
         additional portability flags should prepare a justification for
         their use.  If the justification is 2.2.6(b), please include
         a specific reference to the offending source code module and
         line number, and a specific reference to the relevant sections
         of the appropriate ANSI standard. 
  
         SPEC always prefers to have benchmarks obey the standard, and
         SPEC attempts to fix as many violations as possible before
         release of the suites.  But it is recognized that some
         violations may not be detected until years after a suite is
         released.  In such a case, a portability switch may be the
         practical solution.  Alternatively, the subcommittee may
         approve a source code fix.

         For a given portability problem, the same flag(s) must be
         applied to all affected benchmarks. 

         If a library is specified as a portability flag, SPEC may
         request that the table of contents of the library be included
         in the disclosure.

  2.2.7  Cannot use names

         No source file or variable or subroutine name or function name
         may be used within an optimization flag or compiler option.

  2.2.7.1  Exceptions

         The following parameters in 330.art_m and 331.art_l are
         allowed to be set to any value by the submitter, usually
         using -Dparam=value:

                INTS_PER_CACHELINE
                DBLS_PER_CACHELINE
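
         For example, such a setting might appear in the config file as
         follows (a sketch only; the numeric values are purely
         illustrative, and the variable name follows the usual
         portability-flag conventions of config.html):

                330.art_m=default=default=default:
                CPORTABILITY = -DINTS_PER_CACHELINE=16 -DDBLS_PER_CACHELINE=8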




2.3 Peak Optimizations and Permitted Source Code Changes

  SPEC OMP allows source code modifications for peak runs.  Changes 
  to the directives and source are permitted to facilitate generally 
  useful and portable optimizations, with a focus on improving 
  scalability. Changes in algorithm are not permitted. Vendor 
  unique extensions to OpenMP are allowed, if they are portable.
  
  Examples of compiler flags that are allowed are as follows:
    
    a) Use of subroutine or function names (e.g. for inlining).
    b) Different flags are permitted for each program.  (Base
       allows only one set of flags to be used for all programs.)

  Qualifications for permitted optimizations include:

    a) ANSI standard compliant optimizations
    b) ISO Fortran and C compliant optimizations
    c) Optimizations that produce valid results on other compilers and 
       architectures

  Examples of permitted source code modifications and optimizations are
  as follows:
    
    a) Loop Reordering
    b) Loops that explicitly touch memory in a specific order.
    c) Reshaping arrays
    d) Inlining source code
    e) Parallelization of serial sections without substantive algorithm changes.
    f) Vendor specific OpenMP extensions
    g) Modifications to parallel workload and/or memory distribution

  Examples of optimizations or source code modifications that are not
  permitted are as follows:

    a) Changing a direct solver to an iterative solver.
    b) Adding calls to vendor specific subroutines
       i) Recognizing specific algorithms and substituting math library calls
          (A compiler would be allowed to do this automatically)
    c) Vendor unique directives, which are not OpenMP extensions
    d) Language Extensions

  Full source and a written report of the nature and justification
  of the source changes are required with any peak submission
  having source changes.  These reports will be made public on
  the SPEC website.

  Source code added by a vendor is expected to be portable to other 
  compilers and architectures.  In particular, the source code is required 
  to run on at least one compiler/run-time library/architecture
  combination other than the vendor's own platform.

  All source code changes are subject to review by the HPG committee.

  Source code modifications are protected by a 6-week publication window.
  That is, for a period of 6 weeks after the publication of results based
  on a set of source code changes, results based on the same source code
  modification or technique may not be published without the approval of
  the tester who originated the changes.



3. Running SPEC OMP2001

3.1 System Configuration  

  3.1.1   File Systems 

  SPEC requires the use of a single file system to contain the
  directory tree for the SPEC OMP2001 suite being run.  SPEC allows any
  type of file system (disk-based, memory-based, NFS, DFS, FAT, NTFS
  etc.) to be used.  The type of file system must be disclosed in
  reported results. 

  3.1.2   System State 

  The system state (multi-user, single-user, init level N) may be
  selected by the tester.  This state along with any changes in the
  default configuration of daemon processes or system tuning parameters
  must be documented in the notes section of the results disclosure.
  (For Windows NT, the system state is normally "Default"; a list of
  any services that are shut down, e.g. networking, should be
  provided.) 


3.2 Continuous Run Requirement  
 
  All benchmark executions, including the validation steps,
  contributing to a particular result page must occur continuously, that
  is, in one execution of runspec.  
 
3.3 Run-time environment

  3.3.1 General Run-time environment rules

  SPEC does not attempt to regulate the run-time environment for the
  benchmarks, other than to require that the environment be:
 
       (a) set prior to runspec and consistent throughout the run, with 
           the exception of certain user environment modifications during 
           peak runs described in 3.3.2.
       (b) fully described in the submission, and
       (c) in compliance with section 1, "Philosophy".  

  For example, if each of the following:
  
          run level:   single-user 
          OS tuning:   bigpages=yes, cpu_affinity=hard
          file system: in memory
  
  were set prior to the start of runspec, unchanged during the run,
  described in the submission, and documented and supported by a vendor
  for general use, then these options could be used in an OMP2001
  submission.
   
  Note: Item (a) is intended to forbid all means by which a tester might
  change the environment.  In particular, it is forbidden to change the
  environment during the run using the config file hooks such as
  monitor_pre_bench.  Those hooks are intended for use when studying
  the benchmarks, not for actual submissions.

  3.3.2 Run-time environment modifications during peak runs

  For a peak run, the environment may be changed, subject to the 
  following constraints:
 
     - The environment change must be available to general users.

     - The environment change must be accomplished using the SPEC-
       provided config file hooks (such as fdo_pre0).
 
     - The environment change must be fully disclosed to SPEC (see
       section 4).
 
     - The environment change must be compatible with a Continuous 
       Run (see section 3.2).
 
     - The environment change must be accomplished using simple shell
       commands (such as "env OMP_NUM_THREADS=6") as in section 2.0.5.

3.4 Basepeak

  If a result page will contain both peak and base OMP2001 results, a
  single runspec invocation must have been used to run both the peak and
  base executables for each benchmark and their validations.  The tools
  will ensure that the base executables are run first, followed by the
  peak executables.
 
  It is permitted to:
  
    o Publish a base-only run as both base and peak.  This is 
      accomplished by setting the config file flag "basepeak=yes" on a
      global basis.  When the SPEC tools determine that basepeak is set
      for an entire suite (that is, for all the medium benchmarks or
      for all the large size benchmarks), the peak runs will be
      skipped and base results will be reported as both base and peak.
  
    o Force the same result to be used for both base and peak for one 
      or more individual benchmarks.  This is accomplished by setting
      the config file flag "basepeak=yes" for the desired benchmark(s). 
      In this case, the identical executable will be run for both base
      and peak, and a result will be computed for each in the usual way
      (worse of two runs, or median of three or more).  The lesser of
      the two results will then be reported for both base and peak.
      The reason this feature exists is simply to clarify for the
      reader that an identical executable was used in both runs, and to
      avoid confusion that might otherwise arise from run-to-run
      variation.
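
  In config file terms, the two usages look like this (a sketch; the
  per-benchmark section header follows the usual
  benchmark=tuning=extension=machine form):

        # entire suite: base results reported as both base and peak
        basepeak = yes

        # single benchmark: identical executable used for base and peak
        330.art_m=default=default=default:
        basepeak = yes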
 
  Notes: 
 
    1. It is permitted but not required to compile in the same runspec 
       invocation as the execution.  See rule 2.0.6 regarding cross
       compilation.
 
    2. It is permitted but not required to run both the medium suite 
        and the large suite in a single invocation of runspec.  





4. Results Disclosure

  SPEC requires a full disclosure of results and configuration details
  sufficient to reproduce the results.  SPEC also requires that base
  results be submitted whenever peak results are submitted.  Peak or
  base results published outside of the SPEC web site (www.spec.org),
  in a publicly available medium, and not reviewed by SPEC are either
  estimates or under review, and must be labeled as such.  Results
  published under non-disclosure, for company internal use, or as
  company confidential are not "publicly" available. 
  
  A full disclosure of results will typically include:

  - The components of the disclosure page, as generated by the SPEC
    tools.  

  - The tester's configuration file and any supplemental files needed to 
    build the executables used to generate the results.         

  - A flags definition disclosure.

  A full disclosure of results should include sufficient information to
  allow a result to be independently reproduced.  If a tester is aware
  that a configuration choice affects performance, then s/he should
  document it in the full disclosure.
  
  Note: this rule is not meant to imply that the tester must describe
  irrelevant details or provide massively redundant information.  For
  example, if the SuperHero Model 1 comes with a write-through cache,
  and the SuperHero Model 2 comes with a write-back cache, then
  specifying the model number is sufficient, and no additional steps
  need to be taken to document the cache protocol.  But if the Model 3
  is available with both write-through and write-back caches, then a
  full disclosure must specify which cache is used.

  For information on how to submit a result to SPEC, contact the SPEC
  office.  Contact information is maintained at the SPEC web site,
  www.spec.org


4.1 Rules regarding availability date and systems not yet shipped
 
  If a tester submits results for a hardware or software configuration
  that has not yet shipped, the submitting company must:
 
    - have firm plans to make all components generally available within 
      3 months, to the day, of the first public release of the result 
      (either by the tester or by SPEC, whichever is first)
 
    - specify the availability dates that are planned
  
  "Generally available" means that the product can be ordered by 
  ordinary customers, ships in a reasonable period after orders are 
  submitted, and at least one customer has received it.  (The term
  "reasonable period" is not specified in this paragraph, because it
  varies with the complexity of the system.  But it seems likely that
  a reasonable period for a $500 machine would probably be measured in
  minutes; a reasonable period for a $5,000,000 machine would probably
  be measured in months.)
 
  It is acceptable to test larger configurations than customers are 
  currently ordering, provided that the larger configurations can be
  ordered and the company is prepared to ship them.  For example, if 
  the SuperHero is available in configurations of 1 to 1000 CPUs, but
  the largest order received to date is for 128 CPUs, the tester would
  still be at liberty to test a 1000 CPU configuration and submit the
  result.
 
  A beta release of a compiler (or other software) can be used in a 
  submission, provided that the performance-related features of the
  compiler are committed for inclusion in the final product.  The tester
  should practice due diligence to ensure that the tests do not use an
  uncommitted prototype with no particular shipment plans.  An example
  of due diligence would be a memo from the compiler Project Leader
  which asserts that the tester's version accurately represents the
  planned product, and that the product will ship on date X. 
 
  The general availability date for software is either the committed
  customer shipment date for the final product, or the date of the beta,
  provided that all three of the following conditions are met:

     1. The beta is open to all interested parties without restriction.
        For example, a compiler posted to the web for general users to
        download, or a software subscription service for developers, would
        both be acceptable.
 
     2. The beta is generally announced.  A secret test version is not
        acceptable.

     3. The final product has a committed date for general availability, no
        greater than 3 months after the first public release of the result.

  SPEC is aware that performance results published for systems that 
  have not yet shipped may sometimes be subject to change, for example
  when a last-minute bugfix reduces the final performance.  If something
  becomes known that reduces performance by more than 2.75% on an
  overall metric (for example, SPECompLbase2001 or SPECompLpeak2001),
  SPEC requests that the result be resubmitted.


4.2 Configuration Disclosure 

  The following sections describe the various elements that make up the
  disclosure for the system and test configuration used to produce a
  given test result.  The SPEC tools used for the benchmark allow
  setting this information in the configuration file: 
  

  4.2.1 System Identification

  o System Manufacturer
  o System Model Name
  o SPEC license number
  o Test Sponsor (Name, Location)
  o Test Date (Month, Year)
  o Hardware Availability Date
  o Software Availability Date
  

  4.2.2 Hardware Configuration

  o CPU (Processor Name)
  o CPU MHz
  o FPU
  o Number of CPUs in System
  o Number of CPUs orderable
  o Level 1 Cache (Size and Organization)
  o Level 2 Cache (Size and Organization)
  o Other Cache (Size and Organization)
  o Memory (Size in MB/GB)
  o Disk (Size (MB/GB), Type (SCSI, Fast SCSI, etc.))
  o Other Hardware 
    (Additional equipment added to improve performance, special disk 
     controller, NVRAM file system accelerator etc.)
   
   
  4.2.3 Software Configuration

  o Operating System (Name and Version)
  o System State (e.g. Single User, Multi-user, Init 3, Default) 
  o File System Type
  o Compilers:
        - C Compiler (Name and Version) 
        - Fortran Compiler(s) (Name and Version) 
        - Pre-processors (Name and Version) if used
  o Whether the benchmarks are automatically optimized to run 
    in parallel over multiple CPUs
  o Other Software 
    (Additional software added to improve performance)


  4.2.4 Tuning Information

  o Description of System Tuning
    (Includes any special OS parameters set, changes to standard 
    daemons (services for Windows NT))
  o Base flags list
  o Portability flags used for any benchmark
  o Peak flags list for each benchmark
  o Base environment variable list
  o Peak environment variable list for each benchmark
  o Any additional notes such as listing any HPG approved alternate
    sources or SPEC tool changes used.

  SPEC is aware that sometimes the spelling of compiler switches, or
  even the presence of compiler switches, changes between beta releases
  and final releases.  For example, suppose that during a compiler beta
  the tester specifies:
  
        f90 -fast -architecture_level 3 -unroll 16
 
  but the tester knows that in the final release the architecture level
  will be automatically set by -fast, and the compiler driver is going
  to change to set the default unroll level to 16.  In that case, it
  would be permissible to mention only -fast in the notes section of the
  full disclosure.  The tester is expected to exercise due diligence
  regarding such flag reporting, to ensure that the disclosure correctly
  records the intended final product.  An example of due diligence would
  be a memo from the compiler Project Leader which promises that the
  final product will spell the switches as reported.  SPEC may request
  that such a memo be generated and that a copy be provided to SPEC.

4.3 Test Results Disclosure

  The actual test results consist of the elapsed times and ratios for
  the individual benchmarks and the overall SPEC metric produced by
  running the benchmarks via the SPEC tools.  The required use of the
  SPEC tools ensures that the results generated are based on benchmarks
  built, run, and validated according to the SPEC run rules.  Below is a
  list of the measurement components for each SPEC OMP2001 suite and
  metric: 

  4.3.1   Metrics

  o OMPM2001 Metrics: SPECompMbase2001  (Required Base result)
                      SPECompMpeak2001  (Optional Peak result)
                      SPECompM2001      (Greater of Base and Peak Result)

  o OMPL2001 Metrics: SPECompLbase2001  (Required Base result)
                      SPECompLpeak2001  (Optional Peak result)
                      SPECompL2001      (Greater of Base and Peak Result)

  The elapsed time in seconds for each of the benchmarks in the OMPM2001
  or OMPL2001 suite is given and the ratio to the reference machine (SGI
  2100) is calculated.  The SPECompMbase2001 and SPECompLbase2001 metrics
  are calculated as a Geometric Mean of the individual ratios, where each
  ratio is based on the worse execution time of 2 runs, or the median
  from any number of runs greater than 2.  All runs of a given
  benchmark made using the SPEC tools are required to have validated
  correctly.
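
  Stated informally (ignoring any fixed scaling constant applied by the
  SPEC tools), with N benchmarks in the suite:

        ratio_i           =  reference time for benchmark i (SGI 2100)
                             / measured time for benchmark i (SUT)

        SPECompMbase2001  =  ( ratio_1 * ratio_2 * ... * ratio_N )^(1/N)

  and similarly for the other metrics.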
                                
  The benchmark executables must have been built according to the
  rules described in section 2 above.  


4.4  Metric Selection

  Submission of peak results is considered optional by SPEC, so the 
  tester may choose to submit only base results.  Since by definition
  base results adhere to all the rules that apply to peak results, the
  tester may choose to refer to these results by either the base or
  peak metric names (e.g. SPECompMbase2001 or SPECompMpeak2001).

  It is permitted to publish base-only results.  Alternatively, the use
  of the flag "basepeak" is permitted, as described in section 3.4.

4.5 Research and Academic usage of OMP2001

  SPEC encourages use of the OMP2001 suites in academic and research
  environments.  It is understood that experiments in such environments
  may be conducted in a less formal fashion than that demanded of
  hardware vendors submitting to the SPEC web site.  For example, a
  research environment may use early prototype hardware that simply
  cannot be expected to stay up for the length of time required to meet
  the Continuous Run requirement (see section 3.2), or may use research
  compilers that are unsupported and are not generally available (see 
  section 1).
 
  Nevertheless, SPEC would like to encourage researchers to obey as many
  of the run rules as practical, even for informal research.  SPEC
  respectfully suggests that following the rules will improve the
  clarity, reproducibility, and comparability of research results.
 
  Where the rules cannot be followed, SPEC urges that the results be
  clearly distinguished from results officially submitted to SPEC, by:
 
     - disclosing the deviations from the rules, and 
 
     - reporting results in terms of execution times rather than in 
       terms of SPEC's derived metrics (SPECompMbase2001, SPECompMpeak2001,
       etc.). 
 
  It is especially important to clearly distinguish results that do not
  comply with the run rules when the areas of non-compliance are major,
  such as not using the reference workload, or only being able to
  correctly validate a subset of the benchmarks.

  SPEC may post research reports for simulated systems, future
  systems, and research software.  All posted reports will be
  peer reviewed and may use the OMP2001 benchmarks
  and metrics.  Research results will not be posted on the OMP2001
  results page; the results page is for formal results submissions
  only.
 
  Results posted as a research paper cannot be cited in any 
  product literature except as an estimated result, and must be
  clearly marked as an estimate with a citation of the 
  paper on the SPEC website.

  Additional guidelines for academic and research publications may be 
  found in the HPG section of the SPEC website (www.spec.org/hpg).

4.6 Fair Use  
 
  Consistency and fairness are guiding principles for SPEC. To help
  assure that these principles are met, any organization or individual
  who makes public use of SPEC benchmark results must do so in accordance
  with the SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html.




5.  Run Rule Exceptions 

  If, for some reason, the tester cannot run the benchmarks as
  specified in these rules, the tester can seek SPEC HPG approval
  for performance-neutral alternatives.  No publication may be done
  without such approval.  HPG maintains a Policies and Procedures 
  document that defines the procedures for such exceptions.