SPEC OMP2012 Benchmark Description File

Benchmark Name


Benchmark Author

S. Weeratunga, V. Venkatakrishnan, E. Barszcz, M. Yarrow, H. Jin

Benchmark Program General Category

Computational Fluid Dynamics and Computational Physics

Benchmark Description

Solution of five coupled nonlinear PDEs on a 3-dimensional logically structured grid, using an implicit pseudo-time marching scheme based on a two-factor approximate factorization of the sparse Jacobian matrix. This scheme is functionally equivalent to a nonlinear block SSOR iterative scheme with lexicographic ordering. Spatial discretization of the differential operators is based on a second-order accurate finite volume scheme. The benchmark insists on strict lexicographic ordering during the solution of the regular sparse lower and upper triangular matrices. As a result, the degree of exploitable parallelism during this phase is limited to O(N**2), as opposed to O(N**3) in other phases, and its spatial distribution is non-homogeneous. This also complicates loop reordering to improve cache locality. This version is derived from the NPB 3.3.1 benchmark suite.
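
The O(N**2) limit can be illustrated with a small sketch. It assumes a simplified dependency pattern in which a lexicographically ordered lower triangular sweep makes point (i, j, k) depend on (i-1, j, k), (i, j-1, k) and (i, j, k-1); under that assumption, points on the same hyperplane i + j + k = s do not depend on each other. The function name wavefront_sizes is hypothetical, and this is not necessarily how the benchmark itself schedules the sweep.

```python
from collections import Counter


def wavefront_sizes(n):
    """Count the grid points on each hyperplane i + j + k = s of an n x n x n grid.

    Assumption: each point depends only on its three lower neighbours,
    so all points sharing the same s could be updated concurrently.
    """
    sizes = Counter()
    for i in range(n):
        for j in range(n):
            for k in range(n):
                sizes[i + j + k] += 1
    # Hyperplane indices run from 0 to 3 * (n - 1).
    return [sizes[s] for s in range(3 * n - 2)]


sizes = wavefront_sizes(8)
# The first and last hyperplanes hold a single point each, and the largest
# one holds on the order of n**2 points, never n**3.
print(sizes[0], max(sizes), sum(sizes))
```

The uneven hyperplane sizes are one way to picture the non-homogeneous spatial distribution of the available parallelism.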

Input Description

Input parameters are supplied in the file inputlu.data.

  1. ipr = 1 for detailed progress output
  2. inorm = how often the norm is printed (once every inorm iterations)
  3. itmax = number of pseudo time steps
  4. dt = time step
  5. omega = over-relaxation factor for SSOR
  6. tolrsd = steady state residual tolerance levels
  7. nx, ny, nz = number of grid points in x, y, z directions
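
As a rough illustration of the role of omega, the sketch below applies symmetric successive over-relaxation to a small scalar linear system. This is an assumption-laden simplification: the benchmark applies a nonlinear block scheme to a PDE system, whereas ssor_sweep here is a hypothetical helper for a 3x3 matrix chosen only for the example.

```python
def ssor_sweep(a, b, x, omega):
    """One SSOR iteration: a forward sweep followed by a backward sweep.

    a is a square matrix (list of rows), b the right-hand side and x the
    current iterate, which is updated in place and returned.
    """
    n = len(b)
    for order in (range(n), reversed(range(n))):
        for i in order:
            # Sum of off-diagonal contributions using the latest values of x.
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / a[i][i]
            # omega > 1 over-relaxes the Gauss-Seidel update.
            x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel
    return x


# Illustrative system whose exact solution is [1, 1, 1].
a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = [0.0, 0.0, 0.0]
for _ in range(50):
    ssor_sweep(a, b, x, omega=1.2)
print(x)
```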

The reference dataset uses a grid of 104x1026x1026, tolerances of 1.0e-08, omega of 1.2, a time step of 1, 100 time steps, and norm output every 100 iterations. The train dataset uses a grid of 64x64x64, tolerances of 1.0e-08, omega of 1.2, a time step of 2, 100 time steps, and norm output every 100 iterations. The test dataset uses a grid of 48x48x64, tolerances of 1.0e-08, omega of 1.2, a time step of 2, 250 time steps, and norm output every 250 iterations.
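
For a side-by-side comparison, the three datasets can be written down as records. The sketch below only restates the values given above; the class name LuInput and the helper grid_points are hypothetical, ipr is omitted because its value is not stated here, and nothing is assumed about the actual layout of inputlu.data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LuInput:
    """Input parameters as named in the list above (ipr omitted)."""
    inorm: int      # the norm is printed once every inorm iterations
    itmax: int      # number of pseudo time steps
    dt: float       # time step
    omega: float    # over-relaxation factor for SSOR
    tolrsd: float   # steady state residual tolerance
    nx: int
    ny: int
    nz: int

    def grid_points(self):
        """Total number of grid points, nx * ny * nz."""
        return self.nx * self.ny * self.nz


DATASETS = {
    "ref": LuInput(100, 100, 1.0, 1.2, 1.0e-08, 104, 1026, 1026),
    "train": LuInput(100, 100, 2.0, 1.2, 1.0e-08, 64, 64, 64),
    "test": LuInput(250, 250, 2.0, 1.2, 1.0e-08, 48, 48, 64),
}

for name, params in DATASETS.items():
    print(name, params.grid_points())
```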

Output Description

The program can automatically verify whether a given run conforms to the specification of the benchmark by using internally stored reference solutions. However, these reference solutions are available only for a fixed number of mesh size/time step pairs. If the input data does not correspond to any of the internally stored reference solutions, the verification test is not performed. Otherwise, the output indicates whether or not the run was successful in meeting the requirements of the verification tests. To conform to the specification of the benchmark, a run must pass all three verification tests. Failure in any one or more tests indicates non-conformance with the specification.
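
The decision logic described above can be sketched as follows. This is a minimal illustration, not the benchmark's verification routine: the function name check_run, the test names, the relative-error comparison and the tolerance value are all assumptions made for the example.

```python
def check_run(reference, computed, rel_tol=1.0e-8):
    """Return the verification outcome for one run.

    reference: dict of stored reference values, or None when no reference
               solution exists for the given mesh size and time steps.
    computed:  dict of values produced by the run, keyed like reference.
    """
    if reference is None:
        # No stored solution for this input: the test is skipped.
        return "verification not performed"
    results = []
    for name, ref_value in reference.items():
        err = abs(computed[name] - ref_value)
        results.append(err <= rel_tol * abs(ref_value))
    # Every test has to pass; a single failure means non-conformance.
    return "successful" if all(results) else "unsuccessful"


reference = {"test_1": 1.0, "test_2": 2.0, "test_3": 3.0}
print(check_run(reference, {"test_1": 1.0, "test_2": 2.0, "test_3": 3.0}))
print(check_run(reference, {"test_1": 1.0, "test_2": 2.5, "test_3": 3.0}))
print(check_run(None, {}))
```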

Programming Language


Known portability issues


References

Bailey, D.; Harris, T.; Saphir, W.; van der Wijngaart, R.; Woo, A.; Yarrow, M. (December 1995), "The NAS Parallel Benchmarks 2.0", NAS Technical Report NAS-95-020, NASA Ames Research Center, Moffett Field, CA

Jin, H.; Frumkin, M.; Yan, J. (October 1999), "The OpenMP Implementation of NAS Parallel Benchmarks and Its Performance", NAS Technical Report NAS-99-011, NASA Ames Research Center, Moffett Field, CA

Last update: January 31, 2012