SPEC OMP2012 Flag Description for the Intel(R) C++ and Fortran Compiler 13.x for IA32 and Intel 64 applications


Selecting one of the following will take you directly to that section:

Optimization Flags

Portability Flags

Compiler Flags

Commands and Options Used to Submit Benchmark Runs

dplace -x2 $command
dplace is a tool for binding processes to CPUs.
Here is a brief description of options used in the config file:

Shell, Environment, and Other Software Settings

KMP_LIBRARY = [ throughput | turnaround | serial ]
Selects the OpenMP run-time library execution mode. The options for the variable value are throughput, turnaround, and serial.

The serial mode forces parallel applications to run on a single processor.

In a dedicated (batch or single user) parallel environment where all processors are exclusively allocated to the program for its entire run, it is most important to effectively utilize all of the processors all of the time. The turnaround mode is designed to keep active all of the processors involved in the parallel computation in order to minimize the execution time of a single job. In this mode, the worker threads actively wait for more parallel work, without yielding to other threads.
Avoid over-allocating system resources in this mode. Over-allocation occurs when too many threads are specified or too few processors are available at run time, that is, when threads outnumber the processors available to run them. In that case the turnaround mode performs poorly, and the throughput mode should be used instead.

In a multi-user environment where the load on the parallel machine is not constant or where the job stream is not predictable, it may be better to design and tune for throughput. This minimizes the total time to run multiple jobs simultaneously. In this mode, the worker threads will yield to other threads while waiting for more parallel work.
The throughput mode is designed to make the program aware of its environment (that is, the system load) and to adjust its resource usage to produce efficient execution in a dynamic environment. This mode is the default.
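The guidance above reduces to a simple rule: use turnaround when every thread can have its own processor, and fall back to throughput otherwise. The following is a minimal shell sketch of that rule, not part of the benchmark configuration. The helper name choose_kmp_library is hypothetical, and the sketch assumes that getconf _NPROCESSORS_ONLN reports the number of online processors on the system.

```shell
# Hypothetical helper: pick a KMP_LIBRARY mode from the requested thread
# count and the number of processors available to the job.
choose_kmp_library() {
    threads=$1
    cpus=$2
    if [ "$threads" -gt "$cpus" ]; then
        # More threads than processors: resources are over-allocated.
        echo throughput
    else
        # Every thread can own a processor: keep workers actively waiting.
        echo turnaround
    fi
}

# Assumption: getconf reports online processors; fall back to 1 if it fails.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
threads=${OMP_NUM_THREADS:-$cpus}

KMP_LIBRARY=$(choose_kmp_library "$threads" "$cpus")
export KMP_LIBRARY
echo "threads=$threads cpus=$cpus KMP_LIBRARY=$KMP_LIBRARY"
```

The sketch only compares counts. It does not detect other jobs competing for the same processors, which is the situation the throughput mode is meant to handle.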

KMP_BLOCKTIME sets the time, in milliseconds, that a thread should wait after completing the execution of a parallel region before sleeping. Use the optional character suffixes s (seconds), m (minutes), h (hours), or d (days) to specify the units. Specify infinite for an unlimited wait time.
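To make the unit suffixes concrete, the sketch below converts a wait-time value to milliseconds. It assumes the setting described here is the KMP_BLOCKTIME variable of the Intel OpenMP run-time library. The helper blocktime_to_ms is hypothetical and exists only for illustration; the run-time library does its own parsing.

```shell
# Hypothetical helper: express a wait-time value in milliseconds.
# A bare number is already in milliseconds; "infinite" is passed through.
blocktime_to_ms() {
    value=$1
    case $value in
        infinite) echo infinite ;;
        *s) echo $(( ${value%s} * 1000 )) ;;       # seconds
        *m) echo $(( ${value%m} * 60000 )) ;;      # minutes
        *h) echo $(( ${value%h} * 3600000 )) ;;    # hours
        *d) echo $(( ${value%d} * 86400000 )) ;;   # days
        *)  echo "$value" ;;                       # plain milliseconds
    esac
}

# Assumed variable name; the value uses the seconds suffix.
KMP_BLOCKTIME=2s
export KMP_BLOCKTIME
echo "KMP_BLOCKTIME=$KMP_BLOCKTIME is $(blocktime_to_ms "$KMP_BLOCKTIME") ms"
```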
Specifies the stack size to be allocated for each thread.
The value for the environment variable KMP_AFFINITY affects how the threads from an auto-parallelized program are scheduled across processors.
Specifying disabled completely disables the thread affinity interfaces. This forces the OpenMP run-time library to behave as if the affinity interface was not supported by the operating system. This includes the low-level API interfaces such as kmp_set_affinity and kmp_get_affinity, which have no effect and will return a nonzero error code.
KMP_SCHEDULE fine-tunes the load balancing of parallel loops that are statically scheduled under OpenMP with no chunk size specification.
Setting it to "static,balanced" results in (#iterations/#threads) iterations--rounded to the next lower integer--being allocated to most threads, with at most one additional iteration being allocated to some threads. Although the largest number of iterations assigned to any thread remains the same, this results in a more even sharing of iterations between threads, which may sometimes lead to a performance improvement relative to the default static thread distribution.
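The balanced split described above can be worked through with shell arithmetic. The sketch below is illustrative only and makes two assumptions that the text does not state: that the extra iterations go to the lowest-numbered threads, and that the default static distribution hands out blocks of ceiling(#iterations/#threads) in thread order (called "greedy" here). The second assumption is chosen because it agrees with the statement that the largest per-thread count is unchanged. Both function names are hypothetical.

```shell
# Balanced: every thread gets floor(iters/threads) iterations, and the
# first (iters mod threads) threads get one more (assumed placement).
balanced_split() {
    iters=$1; threads=$2
    base=$(( iters / threads ))
    extra=$(( iters % threads ))
    out=""
    t=0
    while [ "$t" -lt "$threads" ]; do
        if [ "$t" -lt "$extra" ]; then n=$(( base + 1 )); else n=$base; fi
        out="$out${out:+ }$n"
        t=$(( t + 1 ))
    done
    echo "$out"
}

# Greedy (assumed default): each thread in turn takes a block of
# ceil(iters/threads) iterations until none are left.
greedy_split() {
    iters=$1; threads=$2
    chunk=$(( (iters + threads - 1) / threads ))
    left=$iters
    out=""
    t=0
    while [ "$t" -lt "$threads" ]; do
        if [ "$left" -lt "$chunk" ]; then n=$left; else n=$chunk; fi
        left=$(( left - n ))
        out="$out${out:+ }$n"
        t=$(( t + 1 ))
    done
    echo "$out"
}

echo "balanced, 10 iterations on 4 threads: $(balanced_split 10 4)"  # 3 3 2 2
echo "greedy,   10 iterations on 4 threads: $(greedy_split 10 4)"    # 3 3 3 1
```

In both cases the busiest thread runs 3 iterations, but the balanced split leaves no thread with only 1.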
OMP_DYNAMIC = [ 1 | 0 ]
Enables (1) or disables (0) the dynamic adjustment of the number of threads.
Set stack size to unlimited
The command "ulimit -s unlimited" is used to set the stack size limit to unlimited.
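Raising the limit can fail when the hard limit set by the system is lower than the requested value, so a run script may want to check the result. The following is a minimal sketch of such a check; the messages are illustrative. The new limit applies to the current shell and to processes started from it.

```shell
# Show the current soft stack limit (in KiB, or "unlimited").
echo "stack limit before: $(ulimit -s)"

# Try to lift the soft limit; this fails if the hard limit is lower.
if ulimit -s unlimited 2>/dev/null; then
    echo "stack limit now: $(ulimit -s)"
else
    echo "could not raise stack limit; hard limit is $(ulimit -H -s)" >&2
fi
```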