865.roms_s
SPEC CPU®2026 Benchmark Description

Benchmark Name

865.roms_s

Benchmark Program General Category

ROMS is a Regional Ocean Modeling System

Benchmark Authors

The Regional Ocean Modeling System (ROMS) is written and maintained by "The ROMS/TOMS Group", which is described as:

ROMS: Regional Ocean Modeling System
TOMS: Terrain-following Ocean Modeling System

865.roms_s was submitted to the SPEC CPU v8 Benchmark Search Program by Hans Joraandstad hans [dot] joraandstad [at] gmail [dot] com.

Benchmark Description

ROMS is a free-surface, terrain-following, primitive equations ocean model. It is parallelized for both multi-threaded and multi-process architectures. ROMS has been used for applications ranging from planetary scales down to local estuarine environments. Its forecasts include predictions of water temperature, ocean currents, salinity, and sea surface height.

Input Description

865.roms_s uses the generic BENCHMARK data set.
TITLE = Benchmark Test, Idealized Southern Ocean, Small Grid

The files used are:

  varinfo.yaml          Model description, used by all workloads.
                        Sets up over 600 variables.

  roms_benchmarkN.in.x  where N=[0123] representing test, train,
                        refrate, and refspeed.

The roms_benchmarkN.in.x files are read from stdin. For the OpenMP threaded version they are preprocessed to find an optimal and legal set of tiling values for the I and J dimensions, resulting in the files roms_benchmarkN.in.
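The naming scheme can be summarized in a few lines of Python. This is only an illustrative sketch: the function name and the dictionary keys are hypothetical, and the log file name follows the Output Description section of this document.

```python
# Workload index N, in the order given in the file list above.
WORKLOADS = {"test": 0, "train": 1, "refrate": 2, "refspeed": 3}


def workload_files(workload):
    """Return the file names associated with one workload (hypothetical helper)."""
    n = WORKLOADS[workload]
    return {
        "source": f"roms_benchmark{n}.in.x",       # file shipped with the benchmark
        "preprocessed": f"roms_benchmark{n}.in",   # result of the tiling preprocessing
        "log": f"roms_benchmark{n}.log",           # redirected stdout
    }


if __name__ == "__main__":
    for name in WORKLOADS:
        print(name, workload_files(name))
```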

To create larger or smaller data sets, adjust the following parameters in the roms_benchmarkN.in.x files. The example below is from the very short "test" workload:

     Lm == 512        ! Number of I-direction INTERIOR RHO-points
     Mm == 64         ! Number of J-direction INTERIOR RHO-points
      N == 30         ! Number of vertical levels
 NTIMES == 2          ! Total number time-steps in current run.
     DT == 150.0d0    ! Time-Step size in seconds

The first three parameters (Lm, Mm, and N) affect both the memory footprint and the run time of the benchmark. The above settings for Lm, Mm, and N are used for both the test and train workloads, which consume about 372 MiB of memory on a Linux system tested by SPEC.

NTIMES affects only run time. The test workload does only 2 steps because it is intended merely to verify that a valid binary has been built and that it can open its files. For all the other workloads (train, refrate, refspeed), NTIMES was adjusted during pre-release testing to match SPEC's desired run times.
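As a rough illustration of how these parameters combine, the sketch below parses the "test" settings shown above and derives the number of interior grid points and the simulated time span. The parser is a minimal assumption about the `KEY == value ! comment` layout of these five lines, not a reader for the full ROMS input format, and the function name is hypothetical.

```python
import re

# The five parameter lines from the "test" workload, copied from above.
SAMPLE = """\
     Lm == 512        ! Number of I-direction INTERIOR RHO-points
     Mm == 64         ! Number of J-direction INTERIOR RHO-points
      N == 30         ! Number of vertical levels
 NTIMES == 2          ! Total number time-steps in current run.
     DT == 150.0d0    ! Time-Step size in seconds
"""


def parse_params(text):
    """Parse 'KEY == value ! comment' lines into a dict (sketch only)."""
    params = {}
    for line in text.splitlines():
        body = line.split("!", 1)[0]              # drop the trailing comment
        match = re.match(r"\s*(\w+)\s*==\s*(\S+)", body)
        if not match:
            continue
        key, raw = match.groups()
        if re.search(r"[.dDeE]", raw):
            # Fortran double-precision literal such as 150.0d0
            params[key] = float(raw.lower().replace("d", "e"))
        else:
            params[key] = int(raw)
    return params


if __name__ == "__main__":
    p = parse_params(SAMPLE)
    print("interior grid points:", p["Lm"] * p["Mm"] * p["N"])
    print("simulated seconds:   ", p["NTIMES"] * p["DT"])
```

For the test workload this gives 983040 interior points and 300 simulated seconds (2 steps of 150 seconds each).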

For the refrate workload (which is used by the SPECrate version), these values are used, and memory increases to about 1.2 GiB on the same Linux system:

     Lm == 1024       ! Number of I-direction INTERIOR RHO-points
     Mm == 128        ! Number of J-direction INTERIOR RHO-points
      N == 30         ! Number of vertical levels

The refspeed workload (for the SPECspeed version of the benchmark) uses the values shown below. When tested with a single thread it used about 22 GiB rss (resident size) and about the same vsz (virtual size). When many OpenMP threads were used, rss increased to about 24 GiB and vsz to about 30 GiB. Your memory needs will vary depending on your compiler, OpenMP library, number of threads, and your setting for environment variable OMP_STACKSIZE.

     Lm == 8192       ! Number of I-direction INTERIOR RHO-points
     Mm == 256        ! Number of J-direction INTERIOR RHO-points
      N == 40         ! Number of vertical levels
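To compare the three grid sizes, one can count interior points (Lm x Mm x N) for each workload. The sketch below does only that arithmetic; the dictionary and function names are hypothetical, and the point count is a crude proxy that ignores boundary points and per-thread storage.

```python
# (Lm, Mm, N) for each workload, taken from the parameter listings above.
GRIDS = {
    "test/train": (512, 64, 30),
    "refrate": (1024, 128, 30),
    "refspeed": (8192, 256, 40),
}


def interior_points(lm, mm, n):
    """Number of interior grid points for an Lm x Mm x N grid."""
    return lm * mm * n


def relative_size(name, base="test/train"):
    """Ratio of interior points between a workload and the base workload."""
    return interior_points(*GRIDS[name]) / interior_points(*GRIDS[base])


if __name__ == "__main__":
    for name, dims in GRIDS.items():
        print(f"{name:10s} {interior_points(*dims):>10d} points, "
              f"{relative_size(name):.2f}x test/train")
```

By this count refrate has 4 times as many interior points as test/train, and refspeed about 85 times as many. The memory figures quoted above grow somewhat more slowly (roughly 3.3 and 61 times the 372 MiB of test/train).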

Optional debugging information: SPEC builds include a -DNDEBUG option. If it is omitted, the output file will include information on tiling, threads, and memory usage, for example:

Resolution, Grid 01: 512x64x30,  Parallel Threads: 4,  Tiling: 4x4

Tile partition information for Grid 01:  512x64x30  tiling: 4x4
 tile     Istr     Iend     Jstr     Jend     Npts
    0        1      128        1       16    61440
    1      129      256        1       16    61440
    ...
   15      385      512       49       64    61440

Tile minimum and maximum fractional coordinates for Grid 01:
 tile     Xmin     Xmax     Ymin     Ymax     grid
    0    -1.50   515.50     0.50    66.50  RHO-points
    1    -2.50   515.50     0.50    66.50  RHO-points
    ...
   15    -2.50   514.50    -0.50    65.50  RHO-points
    0    -2.00   515.50     0.50    66.50    U-points
    ...
   15    -2.50   514.00    -0.50    65.50    U-points
    0    -1.50   515.50     0.00    66.50    V-points
    ...
   15    -2.50   514.50    -0.50    65.00    V-points

 Dynamic and Automatic memory (MB) usage for Grid 01:  512x64x30  tiling: 4x4
 tile          Dynamic        Automatic            USAGE
    0           301.96            16.83           318.79
    ...
   15             0.00            16.83            16.83
TOTAL           301.96           269.22           571.19
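The tile partition table above can be reproduced with simple integer arithmetic when the grid divides evenly among the tiles. The following is a minimal sketch under that assumption; it is not the ROMS partitioning routine, the function name is hypothetical, and the tile ordering (I varying fastest) is inferred from the sample rows shown above.

```python
def tile_partition(lm, mm, n, ntile_i, ntile_j):
    """Return (tile, Istr, Iend, Jstr, Jend, Npts) rows for an even tiling."""
    if lm % ntile_i or mm % ntile_j:
        raise ValueError("this sketch only handles evenly divisible grids")
    chunk_i = lm // ntile_i
    chunk_j = mm // ntile_j
    rows = []
    for tile in range(ntile_i * ntile_j):
        j, i = divmod(tile, ntile_i)          # I index varies fastest
        istr, iend = i * chunk_i + 1, (i + 1) * chunk_i
        jstr, jend = j * chunk_j + 1, (j + 1) * chunk_j
        npts = chunk_i * chunk_j * n          # points per tile over all levels
        rows.append((tile, istr, iend, jstr, jend, npts))
    return rows


if __name__ == "__main__":
    print(" tile     Istr     Iend     Jstr     Jend     Npts")
    for row in tile_partition(512, 64, 30, 4, 4):
        print("".join(f"{v:>9d}" for v in row)[4:])
```

For the 512x64x30 grid with 4x4 tiling, each tile covers 128 x 16 x 30 = 61440 points, matching the Npts column in the sample output.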

Output Description

After each time step, various calculated energies and the volume are printed. After the run, these are validated against a SPEC-supplied set of expected answers. Note that all of the values printed in the table after each step are calculated; it is expected behavior that 3 of the 4 values printed are the same on every step.

The output is written to stdout, which is redirected to: roms_benchmarkN.log where N is again 0, 1, 2, or 3.

Programming Language

Fortran

Threading Model

The SPECrate version is single-threaded.

The SPECspeed version uses OpenMP. (Note: although ROMS itself supports MPI, the SPEC CPU version does not use MPI.)

Known Portability Issues

ROMS is highly portable. It is a modular code written in modern Fortran. It uses C-preprocessing (for SPEC CPU: specpp) to activate the various physical and numerical options. Several coding standards have been established to facilitate model readability, maintenance, and portability.

As mentioned in system-requirements.html#memory, you may need to adjust your stack size(s). For the SPECrate version 765.roms_r, Linux and Unix users may want to set
ulimit -s unlimited
and users of the Intel Compiler on Windows might need compiler options such as
/F1800000000
to adjust the main process stack size. For the SPECspeed version 865.roms_s, you may need to adjust the main process stack (as shown just above), and you may also need to adjust the stack size for the OpenMP threads using OMP_STACKSIZE and preenv. The SPEC CPU 2017 FAQ item for 627.cam4 may help explain how to set the stack sizes.
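Before a run, it can be useful to check which stack settings a process would actually inherit. The sketch below is one way to do that on Linux and Unix using Python's standard `resource` module; it only reports values and changes nothing. The function names are hypothetical, and the script is not part of the SPEC toolset.

```python
import os
import resource  # standard library, available on Unix-like systems only


def describe_limit(value):
    """Format a stack limit given in bytes, as returned by getrlimit."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def stack_settings():
    """Report the main-process stack limits and OMP_STACKSIZE, if set."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        "main_stack_soft": describe_limit(soft),
        "main_stack_hard": describe_limit(hard),
        "OMP_STACKSIZE": os.environ.get("OMP_STACKSIZE", "(not set)"),
    }


if __name__ == "__main__":
    for name, value in stack_settings().items():
        print(f"{name}: {value}")
```

After `ulimit -s unlimited` in the same shell, the soft limit should be reported as unlimited, provided the hard limit allows it.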

Sources and Licensing

865.roms_s is based on https://github.com/myroms/roms tag: roms-4.1.

ROMS uses an MIT license, as found in License_ROMS.txt; a local copy is in the benchmark Docs/ directory. In August 2023 the file was renamed, but at tag roms-4.1 it can still be found on GitHub under the original name License_ROMS.txt.

Differences from SPEC CPU 2017

The SPEC CPU 2026 benchmark 865.roms_s differs from the SPEC CPU 2017 benchmarks 554.roms_r and 654.roms_s as follows:

The CPU 2017 benchmark was based on ROMS 3.2 (2009); the CPU 2026 version is based on ROMS 4.1 (2023). Many algorithms have been refined or rewritten. The code has been updated to more modern and portable Fortran. It is more modular, making it easier to create new applications by changes to a minimal number of source files.
Workload sizes and run times have been modified to match the requirements for SPEC CPU 2026.
