hpc2021 Result Flag Description

Base Optimization Flags

C benchmarks

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -Mfprelaxed
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Instructs the compiler to use relaxed precision in the calculation of some intrinsic functions. Can result in improved performance at the expense of numerical accuracy.
- -Mnouniform
- mpicc, mpicxx, mpif90
- OPTIMIZE
- The numerical method used to compute the residual iterations of a vectorized (SIMD) loop may differ from the one used in the vectorized loop itself. Using this option may produce faster but less numerically consistent results.
- -Mstack_arrays
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Place automatic arrays on the stack.
- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.
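
To make the "residual iterations" in the -Mnouniform description concrete, the C sketch below splits a sum by hand into a 4-wide main loop and a scalar remainder loop. It is an illustration only: the function name `sum_blocked` and the width of 4 are assumptions, and it does not reproduce the code the NVHPC compiler generates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: sum n floats with a 4-wide "vector" body
   followed by a scalar residual loop for the leftover n % 4 elements. */
float sum_blocked(const float *x, size_t n)
{
    assert(x != NULL || n == 0);

    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main body: 4 elements per iteration, one per lane. */
    for (; i + 4 <= n; i += 4) {
        lane[0] += x[i];
        lane[1] += x[i + 1];
        lane[2] += x[i + 2];
        lane[3] += x[i + 3];
    }
    float total = (lane[0] + lane[1]) + (lane[2] + lane[3]);

    /* Residual loop: the remaining 0 to 3 elements, one at a time. */
    for (; i < n; ++i)
        total += x[i];
    return total;
}
```

The two loops already add the elements in a different order, so floating-point results can depend on where the split falls. Per the description above, -Mnouniform additionally allows the numerical method itself to differ between the two loops.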

C++ benchmarks

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -Mfprelaxed
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Instructs the compiler to use relaxed precision in the calculation of some intrinsic functions. Can result in improved performance at the expense of numerical accuracy.
- -Mnouniform
- mpicc, mpicxx, mpif90
- OPTIMIZE
- The numerical method used to compute the residual iterations of a vectorized (SIMD) loop may differ from the one used in the vectorized loop itself. Using this option may produce faster but less numerically consistent results.
- -Mstack_arrays
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Place automatic arrays on the stack.
- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.

Fortran benchmarks

- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.
- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -Mfprelaxed
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Instructs the compiler to use relaxed precision in the calculation of some intrinsic functions. Can result in improved performance at the expense of numerical accuracy.
- -Mnouniform
- mpicc, mpicxx, mpif90
- OPTIMIZE
- The numerical method used to compute the residual iterations of a vectorized (SIMD) loop may differ from the one used in the vectorized loop itself. Using this option may produce faster but less numerically consistent results.
- -Mstack_arrays
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Place automatic arrays on the stack.

Peak Optimization Flags

C benchmarks

505.lbm_t

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -O3
- mpicc, mpicxx, mpif90
- OPTIMIZE
- All level 1 and 2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or may not be profitable.
- Includes:
  - -O2
    - -O1
- -Mfprelaxed
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Instructs the compiler to use relaxed precision in the calculation of some intrinsic functions. Can result in improved performance at the expense of numerical accuracy.
- -Mnouniform
- mpicc, mpicxx, mpif90
- OPTIMIZE
- The numerical method used to compute the residual iterations of a vectorized (SIMD) loop may differ from the one used in the vectorized loop itself. Using this option may produce faster but less numerically consistent results.
- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.
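
The "code hoisting and scalar replacement" named in the -O3 description can be pictured with a hand-transformed loop. The C sketch below shows one common reading of those terms. The function names are hypothetical, and it does not claim to show what the NVHPC compiler actually emits for any benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* As written: scale * offset is recomputed, and acc[0] is read and
   written, on every iteration. */
void accumulate_plain(double *acc, const double *x, size_t n,
                      double scale, double offset)
{
    assert(acc != NULL);
    for (size_t i = 0; i < n; ++i)
        acc[0] += x[i] * (scale * offset);
}

/* Hand-transformed equivalent, valid when acc does not overlap x. */
void accumulate_hoisted(double *acc, const double *x, size_t n,
                        double scale, double offset)
{
    assert(acc != NULL);
    const double k = scale * offset; /* hoisted loop-invariant product */
    double t = acc[0];               /* scalar replacement of acc[0] */
    for (size_t i = 0; i < n; ++i)
        t += x[i] * k;
    acc[0] = t;                      /* single store after the loop */
}
```

Whether the transformed form is faster depends on the loop and the target, which is consistent with the description's caveat that these optimizations "may or may not be profitable".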

518.tealeaf_t

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -Msafeptr
- mpicc, mpicxx
- OPTIMIZE
- Instructs the C/C++ compiler to override data dependencies between pointers of a given storage class.
- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.
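
The pointer dependencies that -Msafeptr tells the compiler to disregard are easiest to see when two pointers really do overlap. The C sketch below is an analogy using the standard `restrict` qualifier, not a statement of the flag's exact semantics, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Without further information the compiler must assume dst and src may
   overlap, so each store can feed a later load. */
void scale_maybe_alias(float *dst, const float *src, float a, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = a * src[i];
}

/* restrict is the per-pointer promise that dst and src do not overlap,
   which leaves the compiler free to reorder or vectorize the loop. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float a, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = a * src[i];
}
```

Calling `scale_maybe_alias(buf + 1, buf, 2.0f, 3)` on `{1, 1, 1, 1}` is well defined and yields `{1, 2, 4, 8}`, because each iteration reads the value the previous one wrote. The same call through `scale_no_alias` would break the no-overlap promise, so it is only safe when the buffers are truly distinct. A flag that asserts the same thing globally carries the same requirement.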

521.miniswp_t

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -gpu=pinned
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Allocate host data directly in CPU physical (pinned) memory instead of using pinned memory buffers. Allocation cost may be higher, but data transfers from pinned memory are often faster. Useful for programs with few allocations but many data transfers between the host and device.
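
For orientation, the kind of OpenACC directive that -acc=gpu enables, and the host/device transfers that -gpu=pinned is meant to speed up, might look like the C sketch below. The `saxpy` routine is a generic illustration and is not taken from any hpc2021 benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* y = a * x + y. With OpenACC enabled, the directive offloads the loop,
   copying x to the device and y to the device and back. A compiler
   without OpenACC support ignores the pragma and runs the loop on the
   host. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The `copyin` and `copy` clauses are where data moves between host and device, so a routine called many times on the same host arrays matches the "few allocations but many data transfers" pattern described above.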

534.hpgmgfv_t

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -static-nvidia
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Statically link with the NVIDIA runtime libraries. System libraries may still be dynamically linked.
- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.
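
As a rough picture of what a macro such as SPEC_ACCEL_AWARE_MPI can select, the C sketch below switches between handing a device address to the send call and staging the data through the host first. Everything except the macro name is an assumption: `fake_send` stands in for a real MPI call, `send_buffer` is a hypothetical helper, and the benchmark sources may use the macro differently. With OpenACC enabled, the sketch also assumes `buf` is already present on the device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an MPI send call; it only counts elements. */
static size_t elements_sent = 0;
static void fake_send(const double *buf, size_t n)
{
    (void)buf;
    elements_sent += n;
}

/* Returns 1 when the device address is handed to the send, 0 when the
   data is staged through the host copy first. */
int send_buffer(double *buf, size_t n)
{
    assert(buf != NULL);
#ifdef SPEC_ACCEL_AWARE_MPI
    /* Accelerator-aware path: expose the device address of buf. */
    #pragma acc host_data use_device(buf)
    {
        fake_send(buf, n);
    }
    return 1;
#else
    /* Fallback path: refresh the host copy, then send from host memory. */
    #pragma acc update self(buf[0:n])
    fake_send(buf, n);
    return 0;
#endif
}
```

Compiling with -DSPEC_ACCEL_AWARE_MPI selects the first branch. Without the flag the second branch is compiled, which costs a device-to-host copy before every send.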

C++ benchmarks

- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -O3
- mpicc, mpicxx, mpif90
- OPTIMIZE
- All level 1 and 2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or may not be profitable.
- Includes:
  - -O2
    - -O1
- -Mfprelaxed
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Instructs the compiler to use relaxed precision in the calculation of some intrinsic functions. Can result in improved performance at the expense of numerical accuracy.
- -Mnouniform
- mpicc, mpicxx, mpif90
- OPTIMIZE
- The numerical method used to compute the residual iterations of a vectorized (SIMD) loop may differ from the one used in the vectorized loop itself. Using this option may produce faster but less numerically consistent results.
- -Mstack_arrays
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Place automatic arrays on the stack.
- -static-nvidia
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Statically link with the NVIDIA runtime libraries. System libraries may still be dynamically linked.
- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.

Fortran benchmarks

535.weather_t

- -DSPEC_ACCEL_AWARE_MPI
- OPTIMIZE
- Defining this macro indicates that the MPI implementation supports accelerator device-to-device transfers. Used in conjunction with OpenACC or OpenMP with target offload.
- -fast
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Chooses generally optimal flags for the target platform.
- Includes:
  - -O2
    - -O1
  - -Munroll=c:1
    - -Munroll
  - -Mautoinline
  - -Mlre
  - -Mvect=sse
    - -Mvect
      - -Mvect=assoc
      - -Mvect=altcode
  - -Mcache_align
  - -Mflushz
- -acc=gpu
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Enable OpenACC directives targeting NVIDIA GPUs.
- -O3
- mpicc, mpicxx, mpif90
- OPTIMIZE
- All level 1 and 2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or may not be profitable.
- Includes:
  - -O2
    - -O1
- -Mfprelaxed
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Instructs the compiler to use relaxed precision in the calculation of some intrinsic functions. Can result in improved performance at the expense of numerical accuracy.
- -Mnouniform
- mpicc, mpicxx, mpif90
- OPTIMIZE
- The numerical method used to compute the residual iterations of a vectorized (SIMD) loop may differ from the one used in the vectorized loop itself. Using this option may produce faster but less numerically consistent results.
- -Mstack_arrays
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Place automatic arrays on the stack.
- -static-nvidia
- mpicc, mpicxx, mpif90
- OPTIMIZE
- Statically link with the NVIDIA runtime libraries. System libraries may still be dynamically linked.

Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org.
Copyright 2021-2023 Standard Performance Evaluation Corporation
Tested with SPEC hpc2021 v1.0.3.
Report generated on 2023-03-27 12:20:43 by SPEC hpc2021 flags formatter v1.0.3.

hpc2021 Flag Description

Test sponsored by xFusion

Compilers: NVHPC SDK

Operating systems: Linux

Base Compiler Invocation

C benchmarks

C++ benchmarks

Fortran benchmarks

Peak Compiler Invocation

C benchmarks

C++ benchmarks

Fortran benchmarks

Base Portability Flags

532.sph_exa_t

Base Optimization Flags

C benchmarks

C++ benchmarks

Fortran benchmarks

Peak Optimization Flags

C benchmarks

505.lbm_t

513.soma_t

518.tealeaf_t

521.miniswp_t

534.hpgmgfv_t

C++ benchmarks

Fortran benchmarks

519.clvleaf_t

528.pot3d_t

535.weather_t

Base Other Flags

C benchmarks

C++ benchmarks

Fortran benchmarks

Peak Other Flags

C benchmarks

C++ benchmarks

Fortran benchmarks

Implicitly Included Flags
