Platform Settings for Dell Precision Workstations

Operating System Tuning Parameters

KMP_AFFINITY = granularity=fine,scatter

The value of the environment variable KMP_AFFINITY affects how the threads of an auto-parallelized program are scheduled across processors. Specifying granularity=fine selects the finest granularity level and causes each OpenMP thread to be bound to a single thread context. Specifying scatter distributes the threads as evenly as possible across the entire system. Hence a combination of these two options spreads the threads evenly across sockets, with one thread per physical core, provided the number of threads does not exceed the number of physical cores. On cores supporting Hyper-Threading Technology, this ensures that only one thread runs per core.
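The placement described above can be illustrated with a small model. This is a simplified sketch, not the OpenMP runtime's actual algorithm: the function name scatter_placement and the round-robin formula are assumptions made for illustration, and the topology (2 sockets, 4 cores per socket) is hypothetical.

```python
def scatter_placement(num_threads, sockets, cores_per_socket, threads_per_core=2):
    """Toy model of scatter placement with fine granularity.

    Returns one (socket, core, hw_thread) tuple per OpenMP thread id.
    Assumption: threads are dealt round-robin across sockets first, then
    across cores, and only then onto the second hardware thread of a core.
    """
    capacity = sockets * cores_per_socket * threads_per_core
    if num_threads > capacity:
        raise ValueError("more threads than hardware thread contexts")

    placement = []
    for tid in range(num_threads):
        socket = tid % sockets                          # alternate sockets
        core = (tid // sockets) % cores_per_socket      # next core on that socket
        hw_thread = tid // (sockets * cores_per_socket) # 0 until all cores are used
        placement.append((socket, core, hw_thread))
    return placement


if __name__ == "__main__":
    # Hypothetical system: 2 sockets x 4 cores, Hyper-Threading enabled.
    for tid, (socket, core, hw) in enumerate(scatter_placement(8, 2, 4)):
        print(f"thread {tid}: socket {socket}, core {core}, hw thread {hw}")
```

In this model, running 8 threads on 8 physical cores leaves the second hardware thread of every core idle, which matches the one-thread-per-core behavior described above.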

OMP_NUM_THREADS

Sets the maximum number of threads to use for OpenMP* parallel regions if no other value is specified in the application. This environment variable applies to both -openmp and -parallel (Linux and Mac OS X) or /Qopenmp and /Qparallel (Windows).

Example syntax on a Windows system with 8 cores:
set OMP_NUM_THREADS=8
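
The same setting can also be applied to a single benchmark run rather than to the whole shell session. The following is a minimal sketch, assuming the benchmark is started as a child process; the helper names openmp_env and run_with_threads are hypothetical.

```python
import os
import subprocess
import sys


def openmp_env(num_threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    count = int(num_threads)
    if count < 1:
        raise ValueError("OMP_NUM_THREADS must be a positive integer")
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(count)
    return env


def run_with_threads(command, num_threads):
    """Run a command with OMP_NUM_THREADS set only for that child process."""
    return subprocess.run(command, env=openmp_env(num_threads), check=True)


if __name__ == "__main__":
    # Stand-in for a real benchmark: the child just echoes the variable.
    child = [sys.executable, "-c",
             "import os; print(os.environ['OMP_NUM_THREADS'])"]
    run_with_threads(child, 8)
```

Because the variable is set on a copy of the environment, the parent session keeps its own value.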


Firmware / BIOS / Microcode Settings

Adjacent Cache Line Prefetch

This BIOS option enables or disables a processor mechanism that, on a cache line miss, also fetches the adjacent cache line within the 128-byte sector that contains the requested data.

In some limited cases, changing this option from its default may improve performance. In the majority of cases, the default setting provides better performance. Users should modify this option only after performing application benchmarking to verify improved performance in their environment.

Hardware Prefetch

This BIOS option enables or disables a processor mechanism that prefetches data into the cache according to a pattern recognition algorithm.

In some limited cases, setting this option to Disabled may improve performance. In the majority of cases, leaving the option set to Enabled provides better performance. Users should only disable this option after performing application benchmarking to verify improved performance in their environment.
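
The benchmarking recommended for the prefetch options can be as simple as timing the application several times under each BIOS setting and comparing the medians. The following is a minimal sketch under that assumption; median_runtime and speedup are hypothetical helper names, and because a BIOS change requires a reboot, each setting has to be measured in a separate session and the results compared afterwards.

```python
import statistics
import subprocess
import sys
import time


def median_runtime(command, repeats=5):
    """Run a command several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(command, check=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """Ratio above 1.0 means the candidate setting ran faster than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Stand-in workload; replace with the real application command line.
    workload = [sys.executable, "-c", "sum(range(1_000_000))"]
    print(f"median runtime: {median_runtime(workload):.3f} s")
```

Using the median rather than the mean reduces the influence of an occasional slow run caused by unrelated system activity.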

Hyper-Threading Technology

This BIOS setting enables or disables Hyper-Threading (HT) Technology. HT lets each physical core run an additional hardware thread, so the operating system sees more logical processors than physical cores.
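
To confirm from the operating system whether the BIOS setting took effect, recent Linux kernels report the state in sysfs. The following is a minimal sketch, assuming a Linux system that provides /sys/devices/system/cpu/smt/active; the function name smt_active is hypothetical, and on systems without that file it simply reports that the state is unknown.

```python
from pathlib import Path


def smt_active(path="/sys/devices/system/cpu/smt/active"):
    """Return True if SMT (Hyper-Threading) is active, False if not,
    or None when the sysfs file is missing or unreadable."""
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None
    return text == "1"


if __name__ == "__main__":
    state = smt_active()
    if state is None:
        print("SMT state not available on this system")
    else:
        print("Hyper-Threading active" if state else "Hyper-Threading inactive")
```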

Memory Node Interleaving

When set to NUMA (Non-Uniform Memory Access), this BIOS setting configures the system memory into blocks local to each processor. A NUMA-aware operating system can use this configuration to intelligently allocate memory for optimal performance.
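
Whether the operating system actually sees separate memory nodes can be checked on Linux by listing the node directories the kernel exposes. The following is a minimal sketch, assuming a Linux system with /sys/devices/system/node; the function name numa_nodes is hypothetical. On a multi-processor system in NUMA mode it would typically report more than one node.

```python
import re
from pathlib import Path


def numa_nodes(root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids visible to the operating system.

    An empty list means the directory is absent (for example, on a
    non-Linux system or a kernel without NUMA support).
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    ids = []
    for entry in root_path.iterdir():
        # Node directories are named node0, node1, ...; skip other entries.
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match and entry.is_dir():
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = numa_nodes()
    print(f"{len(nodes)} NUMA node(s) visible: {nodes}")
```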