SPEC ACCEL PGI - Platform settings

Shell, Environment, and Other Software Settings

Shell, Environment, and Other Software Settings - All compiler versions

Set stack size to unlimited
The command "ulimit -s unlimited" is used to set the stack size limit to unlimited.
numactl --interleave=all "runspec command"
Launching a process with numactl --interleave=all sets the memory interleave policy so that memory is allocated round-robin across all nodes. When memory cannot be allocated on the current interleave target, allocation falls back to other nodes.
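The two settings above are typically combined in the shell that launches the run. The following is a minimal sketch, not the submitter's actual invocation: the config file name my.cfg and the benchmark selection are hypothetical placeholders, and the command line is only assembled and printed so that the snippet can be tried on a system without numactl or runspec installed.

```shell
# Raise the stack limit for this shell and its children.
# This can fail if the hard limit is lower, so report it instead of aborting.
ulimit -s unlimited 2>/dev/null || echo "could not raise the stack limit" >&2

# Hypothetical runspec arguments, used only for illustration.
RUNSPEC_CMD="runspec --config=my.cfg openacc"

# Prefix the command with numactl so its memory is interleaved across all nodes.
LAUNCH="numactl --interleave=all $RUNSPEC_CMD"

# Print the assembled command line; execute it with: eval "$LAUNCH"
echo "$LAUNCH"
```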
OMP_NUM_THREADS
Sets the maximum number of threads to use for OpenMP parallel regions if no other value is specified in the application. Example syntax on a Linux system with 8 cores: export OMP_NUM_THREADS=8
OMP_STACKSIZE
Specify stack size to be allocated for each thread.
ACC_NUM_CORES
Sets the maximum number of CPU cores to use. Default is to use all available physical cores.
HUGETLB_PATH
Set the huge TLB pages mount path.
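As an illustration of the variables above, the following sketch exports them for a Linux system with 8 cores. All values are assumptions chosen for the example: the 64M stack size, the core count, and the /mnt/hugetlbfs mount point are hypothetical and should be replaced with values that match the system under test.

```shell
# Use at most 8 threads in OpenMP parallel regions (matches the 8-core example).
export OMP_NUM_THREADS=8

# Per-thread stack size; 64M is an illustrative value, not a recommendation.
export OMP_STACKSIZE=64M

# Cap the number of CPU cores used instead of the default of all physical cores.
export ACC_NUM_CORES=8

# Hypothetical mount point of the hugetlbfs file system.
export HUGETLB_PATH=/mnt/hugetlbfs

# Show what was set.
printf '%s\n' "OMP_NUM_THREADS=$OMP_NUM_THREADS" "OMP_STACKSIZE=$OMP_STACKSIZE" \
  "ACC_NUM_CORES=$ACC_NUM_CORES" "HUGETLB_PATH=$HUGETLB_PATH"
```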

Shell, Environment, and Other Software Settings - PGI LLVM Compiler - x86 and Power architectures

OMP_PROC_BIND
If set to 'true', bind CPU threads to cores. Default is 'false'.
OMP_PLACES
Set the thread to core number binding.
Example: OMP_PLACES={0},{1},{2},{3}
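A place list like the one in the example can be generated rather than typed by hand. The sketch below builds one single-core place per thread for a hypothetical thread count of 4 and enables binding; the variable names NTHREADS and PLACES are introduced here for illustration only. The value is quoted so that the shell passes the braces through literally.

```shell
# Hypothetical thread count for the example.
NTHREADS=4

# Build "{0},{1},...,{N-1}": one place per core, in core-number order.
PLACES=""
i=0
while [ "$i" -lt "$NTHREADS" ]; do
  PLACES="${PLACES:+$PLACES,}{$i}"
  i=$((i + 1))
done

# Enable binding and publish the place list.
export OMP_PROC_BIND=true
export OMP_PLACES="$PLACES"
echo "OMP_PLACES=$OMP_PLACES"
```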
KMP_AFFINITY
Syntax: KMP_AFFINITY=[<modifier>,...]<type>[,<permute>][,<offset>]
The value for the environment variable KMP_AFFINITY affects how the threads from an auto-parallelized program are scheduled across processors.
It applies to binaries built with -mp using the PGI LLVM compilers on x86 and Power.
modifier:
    granularity=fine Causes each OpenMP thread to be bound to a single thread context.
type:
    compact Specifying compact assigns the OpenMP thread <n>+1 to a free thread context as close as possible to the thread context where the <n> OpenMP thread was placed.
    scatter Specifying scatter distributes the threads as evenly as possible across the entire system.
permute: The permute specifier is an integer value that controls which levels are most significant when sorting the machine topology map. A value for permute forces the mappings to make the specified number of most significant levels of the sort the least significant, and it inverts the order of significance.
offset: The offset specifier indicates the starting position for thread assignment.

Example: KMP_AFFINITY=granularity=fine,scatter
Specifying granularity=fine selects the finest granularity level and causes each OpenMP or auto-par thread to be bound to a single thread context.
This ensures that there is only one thread per core on cores supporting HyperThreading Technology.
Specifying scatter distributes the threads as evenly as possible across the entire system.
Hence, a combination of these two options will spread the threads evenly across sockets, with one thread per physical core.

Example: KMP_AFFINITY=compact,1,0
Specifying compact will assign the n+1 thread to a free thread context as close as possible to thread n.
A default granularity=core is implied if no granularity is explicitly specified.
Specifying 1,0 sets permute and offset values of the thread assignment.
With a permute value of 1, thread n+1 is assigned to a consecutive core. With an offset of 0, the process's first thread (thread 0) will be assigned to thread context 0.
The same behavior is exhibited in a multisocket system.
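The two example settings above can be selected from a launch script as sketched below. The POLICY variable is a hypothetical helper introduced for this illustration; only the KMP_AFFINITY values themselves come from the examples in this section.

```shell
# Choose one of the two policies described above: "scatter" or "compact".
POLICY=scatter

case "$POLICY" in
  # One thread per physical core, spread evenly across sockets.
  scatter) export KMP_AFFINITY="granularity=fine,scatter" ;;
  # Consecutive placement with permute=1 and offset=0.
  compact) export KMP_AFFINITY="compact,1,0" ;;
  *) echo "unknown policy: $POLICY" >&2 ;;
esac

echo "KMP_AFFINITY=$KMP_AFFINITY"
```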

Shell, Environment, and Other Software Settings - PGI Native x86 compilers

MP_BIND
If set to 'yes', bind CPU threads to cores. Default is 'no'.
MP_BLIST
Set the thread to core number binding.
Example: MP_BLIST=0,1,2,3
MP_SPIN
Specifies the number of times to check a semaphore before calling sched_yield() (on Linux or macOS) or _sleep() (on Windows).
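For binaries built with the PGI native x86 compilers, the three variables above might be set together as in the following sketch. The core list and the spin count are illustrative assumptions, not tuned or default values.

```shell
# Bind CPU threads to cores.
export MP_BIND=yes

# Bind threads 0..3 to cores 0..3, in that order.
export MP_BLIST=0,1,2,3

# Number of semaphore checks before yielding; 1000 is an illustrative value only.
export MP_SPIN=1000

printf '%s\n' "MP_BIND=$MP_BIND" "MP_BLIST=$MP_BLIST" "MP_SPIN=$MP_SPIN"
```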