ACCEL Flag Description
Supermicro SuperServer 1028GR-TR

Test sponsored by NVIDIA Corporation


Base Compiler Invocation

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C


Base Optimization Flags

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C

353.clvrleaf

359.miniGhost


Peak Optimization Flags

C benchmarks

303.ostencil

304.olbm

314.omriq

352.ep

354.cg

357.csp

370.bt

Fortran benchmarks

350.md

351.palm

355.seismic

356.sp

360.ilbdc

363.swim

Benchmarks using both Fortran and C

353.clvrleaf

359.miniGhost


Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.


Shell, Environment, and Other Software Settings

Shell, Environment, and Other Software Settings - All compiler versions

Set stack size to unlimited
The command "ulimit -s unlimited" is used to set the stack size limit to unlimited.
numactl --interleave=all "runspec command"
Launching a process with numactl --interleave=all sets the memory interleave policy so that memory is allocated round-robin across NUMA nodes. When memory cannot be allocated on the current interleave target, allocation falls back to other nodes.
OMP_NUM_THREADS
Sets the maximum number of threads to use for OpenMP parallel regions if no other value is specified in the application. Example syntax on a Linux system with 8 cores: export OMP_NUM_THREADS=8
OMP_STACKSIZE
Specifies the stack size to be allocated for each OpenMP thread.
ACC_NUM_CORES
Sets the maximum number of CPU cores to use. Default is to use all available physical cores.
HUGETLB_PATH
Set the huge TLB pages mount path.
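
The settings above might be combined in a launch script along the following lines. This is a minimal sketch, not the tester's actual script: the values for OMP_STACKSIZE and ACC_NUM_CORES are illustrative assumptions, and the runspec arguments are left as a placeholder.

```shell
#!/bin/sh
# Sketch only: the values below are illustrative assumptions,
# not the settings used for this result.

# Raise the stack limit; this can fail if the hard limit is lower.
ulimit -s unlimited 2>/dev/null || echo "could not raise stack limit" >&2

export OMP_NUM_THREADS=8   # example value from the text (8-core Linux system)
export OMP_STACKSIZE=64M   # hypothetical per-thread stack size
export ACC_NUM_CORES=8     # hypothetical cap on CPU cores

# Use numactl as a launch prefix only when it is installed.
if command -v numactl >/dev/null 2>&1; then
    LAUNCH="numactl --interleave=all"
else
    LAUNCH=""
fi

# Print the command instead of running it; the runspec arguments are a placeholder.
echo "would run: ${LAUNCH:+$LAUNCH }runspec <arguments>"
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without SPEC ACCEL installed.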

Shell, Environment, and Other Software Settings - PGI LLVM Compiler - x86 and Power architectures

OMP_PROC_BIND
If set to 'true', bind CPU threads to cores. Default is 'false'.
OMP_PLACES
Set the thread to core number binding.
Example: OMP_PLACES={0},{1},{2},{3}
KMP_AFFINITY
Syntax: KMP_AFFINITY=[<modifier>,...]<type>[,<permute>][,<offset>]
The value for the environment variable KMP_AFFINITY affects how the threads from an auto-parallelized program are scheduled across processors.
It applies to binaries built with -mp using the PGI LLVM compilers on x86 and Power.
modifier:
    granularity=fine Causes each OpenMP thread to be bound to a single thread context.
type:
    compact Specifying compact assigns OpenMP thread <n>+1 to a free thread context as close as possible to the thread context where OpenMP thread <n> was placed.
    scatter Specifying scatter distributes the threads as evenly as possible across the entire system.
permute: The permute specifier is an integer value that controls which levels are most significant when sorting the machine topology map. A value for permute forces the mappings to make the specified number of most significant levels of the sort the least significant, and it inverts the order of significance.
offset: The offset specifier indicates the starting position for thread assignment.

Example: KMP_AFFINITY=granularity=fine,scatter
Specifying granularity=fine selects the finest granularity level and causes each OpenMP or auto-par thread to be bound to a single thread context.
This ensures that there is only one thread per core on cores supporting HyperThreading Technology.
Specifying scatter distributes the threads as evenly as possible across the entire system.
Hence a combination of these two options will spread the threads evenly across sockets, with one thread per physical core.
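
To make the difference between the scatter and compact types concrete, the script below prints where each thread would land under a deliberately simplified placement model. It does not query the OpenMP runtime: the topology (2 sockets, 4 cores per socket, one thread context per core) and the helper function place are assumptions made for illustration only.

```shell
#!/bin/sh
# Simplified placement model; NOT output from the OpenMP runtime.
# Hypothetical topology: 2 sockets, 4 cores per socket,
# one thread context per core.
SOCKETS=2
CORES_PER_SOCKET=4
THREADS=4

# place TYPE THREAD: prints "socket core" under the simplified model.
place() {
    case "$1" in
        scatter)
            # Round-robin across sockets first, then advance the core.
            echo "$(( $2 % $SOCKETS )) $(( ($2 / $SOCKETS) % $CORES_PER_SOCKET ))"
            ;;
        compact)
            # Fill consecutive cores on one socket before moving on.
            echo "$(( ($2 / $CORES_PER_SOCKET) % $SOCKETS )) $(( $2 % $CORES_PER_SOCKET ))"
            ;;
    esac
}

for type in scatter compact; do
    t=0
    while [ "$t" -lt "$THREADS" ]; do
        # Split the "socket core" pair into $1 and $2.
        set -- $(place "$type" "$t")
        echo "$type: thread $t -> socket $1, core $2"
        t=$((t + 1))
    done
done
```

Under this model, four scatter threads alternate between the two sockets, while four compact threads all stay on socket 0. Real placement depends on the machine topology the runtime detects.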

Example: KMP_AFFINITY=compact,1,0
Specifying compact assigns thread n+1 to a free thread context as close as possible to that of thread n.
A default granularity=core is implied if no granularity is explicitly specified.
Specifying 1,0 sets the permute and offset values of the thread assignment.
With a permute value of 1, thread n+1 is assigned to a consecutive core. With an offset of 0, the process's first thread (thread 0) is assigned to thread context 0.
The same behavior is exhibited in a multisocket system.
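
For the OMP_PROC_BIND and OMP_PLACES variables listed at the start of this section, a place list of the form shown in the OMP_PLACES example can be generated rather than typed by hand. The sketch below assumes a hypothetical core count of 4 and assumes that cores are numbered contiguously from 0; neither is taken from this result.

```shell
#!/bin/sh
# Build OMP_PLACES={0},{1},...,{N-1} for a hypothetical core count.
# Assumes contiguous core numbering starting at 0.
NCORES=4

places=""
i=0
while [ "$i" -lt "$NCORES" ]; do
    # Append "{i}", inserting a comma separator after the first entry.
    places="${places:+$places,}{$i}"
    i=$((i + 1))
done

export OMP_PROC_BIND=true
export OMP_PLACES="$places"
echo "OMP_PROC_BIND=$OMP_PROC_BIND OMP_PLACES=$OMP_PLACES"
```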

Shell, Environment, and Other Software Settings - PGI Native x86 compilers

MP_BIND
If set to 'yes', bind CPU threads to cores. Default is 'no'.
MP_BLIST
Set the thread to core number binding.
Example: MP_BLIST=0,1,2,3
MP_SPIN
Specifies the number of times to check a semaphore before calling sched_yield() (on Linux or macOS) or _sleep() (on Windows).
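
A binding setup using MP_BIND and MP_BLIST might look like the sketch below. The core count of 4 is a hypothetical value, and MP_SPIN is left at its default because the text above does not give a recommended setting.

```shell
#!/bin/sh
# Sketch: enable binding and list cores 0..N-1 for a hypothetical N.
NCORES=4

blist=""
i=0
while [ "$i" -lt "$NCORES" ]; do
    # Append the core number, comma-separated after the first entry.
    blist="${blist:+$blist,}$i"
    i=$((i + 1))
done

export MP_BIND=yes        # turn on thread-to-core binding
export MP_BLIST="$blist"  # cores to bind to, in thread order
echo "MP_BIND=$MP_BIND MP_BLIST=$MP_BLIST"
```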

Flag description origin markings:

[user] Indicates that the flag description came from the user flags file.
[suite] Indicates that the flag description came from the suite-wide flags file.
[benchmark] Indicates that the flag description came from a per-benchmark flags file.

The flags files that were used to format this result can be browsed at
https://www.spec.org/accel/flags/PGI-Platform-Multicore-OMP.html,
https://www.spec.org/accel/flags/pgi2018_flags.html.

You can also download the XML flags sources by saving the following links:
https://www.spec.org/accel/flags/PGI-Platform-Multicore-OMP.xml,
https://www.spec.org/accel/flags/pgi2018_flags.xml.


For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact webmaster@spec.org
Copyright 2015-2018 Standard Performance Evaluation Corporation
Tested with SPEC ACCEL v1.2.
Report generated on Thu Aug 30 18:55:29 2018 by SPEC ACCEL flags formatter v1290.