AMD Optimizing C/C++ Compiler Suite Version 1.3.0 SPEC CPU2017 Flag Description

aocc130-flags-revA2

Using numactl to bind processes and memory to cores

For multi-copy runs, or single-copy runs on systems with multiple sockets, it is advantageous to bind each process to a particular core. Otherwise, the OS may arbitrarily move a process from one core to another, which can affect performance. To help, SPEC allows the use of a "submit" command where users can specify a utility to use to bind processes. We have found the utility 'numactl' to be the best choice.

numactl runs processes with a specific NUMA scheduling or memory placement policy. The policy is set for a command and inherited by all of its children. The numactl flag "--physcpubind" specifies which core(s) to bind the process to. "-l" instructs numactl to keep a process's memory on the local node, while "-m" specifies the node(s) on which to place a process's memory. For full details on using numactl, please refer to your Linux documentation, 'man numactl'.

Note that some older versions of numactl incorrectly interpret application arguments as their own. For example, with the command "numactl --physcpubind=0 -l a.out -m a", numactl will interpret a.out's "-m" option as its own "-m" option. To work around this problem, we put the command to be run in a shell script and then run the shell script using numactl. For example: "echo 'a.out -m a' > run.sh ; numactl --physcpubind=0 bash run.sh"
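The wrapper-script workaround can be sketched as a small shell helper. This is only an illustration: the function name write_wrapper is hypothetical, and the final command is printed rather than executed so that the sketch runs even where numactl or a.out is not installed.

```shell
#!/bin/sh
# Sketch of the wrapper-script workaround. write_wrapper is an
# illustrative name, not part of numactl or the SPEC tools.

# Write the application command line into a script so that numactl
# never sees the application's own options.
write_wrapper() {
    script=$1
    shift
    printf '%s\n' "$*" > "$script"
}

write_wrapper run.sh a.out -m a

# numactl now parses only its own options; "-m a" stays inside run.sh.
# Dry run: drop the "echo" to actually launch the command.
echo numactl --physcpubind=0 bash run.sh
```

Keeping the application's arguments inside run.sh means the workaround does not depend on how a given numactl version parses its command line.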

Transparent Huge Pages (THP)

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. THP is designed to hide much of the complexity in using huge pages from system administrators and developers, as normal huge pages must be assigned at boot time, can be difficult to manage manually, and often require significant changes to code in order to be used effectively. Most recent Linux OS releases have THP enabled by default.
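To see whether THP is active on a given system, the kernel's sysfs status file can be read. The following is a minimal sketch, assuming the usual location /sys/kernel/mm/transparent_hugepage/enabled. The helper name active_thp_mode is hypothetical.

```shell
#!/bin/sh
# Sketch: report which THP mode is active. The kernel marks the active
# mode with square brackets, e.g. "always [madvise] never".
thp_file=/sys/kernel/mm/transparent_hugepage/enabled

# Extract the bracketed word from a line such as "always [madvise] never".
active_thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

if [ -r "$thp_file" ]; then
    echo "THP mode: $(active_thp_mode "$(cat "$thp_file")")"
else
    echo "THP status file not found: $thp_file"
fi
```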

Linux Huge Page settings

If you need finer control you can manually set huge pages using the following steps:

Create a mount point for the huge pages: mkdir /mnt/hugepages
The huge page file system needs to be mounted each time the system reboots. Add the following to a system boot configuration file before any services are started: mount -t hugetlbfs nodev /mnt/hugepages
Set vm/nr_hugepages=N in /etc/sysctl.conf where N is the maximum number of pages the system may allocate.
Reboot to have the changes take effect.
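Choosing N for the steps above is a matter of dividing the amount of memory to reserve by the huge page size. The sketch below shows that arithmetic. The helper name hugepages_needed and the 4096 MB target are illustrative assumptions, and the 2048 kB fallback is only a common x86-64 value.

```shell
#!/bin/sh
# Sketch: compute N for vm/nr_hugepages from a target amount of memory.
# hugepages_needed is an illustrative helper, not a system utility.

# Usage: hugepages_needed <target_mb> <hugepage_size_kb>
# Rounds up so the pool covers at least target_mb.
hugepages_needed() {
    target_kb=$(( $1 * 1024 ))
    echo $(( (target_kb + $2 - 1) / $2 ))
}

# Read the huge page size from /proc/meminfo, falling back to 2048 kB
# (an assumed default) if the field is absent.
page_kb=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo 2>/dev/null || true)
page_kb=${page_kb:-2048}

# Print the line that would go into /etc/sysctl.conf for a 4096 MB pool.
echo "vm/nr_hugepages=$(hugepages_needed 4096 "$page_kb")"
```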

Note that further information about huge pages may be found in the Linux kernel documentation file hugetlbpage.txt.

ulimit -s <n>

Sets the stack size to n kbytes, or to unlimited to allow the stack to grow without limit.

ulimit -l <n>

Sets the maximum size of memory that may be locked into physical memory.
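Limits set with ulimit apply to the current shell and the processes it starts. The sketch below shows one way to inspect and change them. The helper name current_limits is hypothetical, and the attempt to raise the stack limit is reported rather than assumed to succeed, since it depends on the hard limit.

```shell
#!/bin/sh
# Sketch: print the current soft limits that "ulimit -s" and "ulimit -l"
# control. current_limits is an illustrative helper name.
current_limits() {
    echo "stack=$(ulimit -s) memlock=$(ulimit -l)"
}

# A subshell keeps any change local to the commands run inside it.
(
    # Raising a limit fails if the hard limit is lower, so report it.
    ulimit -s unlimited 2>/dev/null || echo "stack limit left unchanged" >&2
    current_limits
)
```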

cpupower frequency-set -r -g performance (on Ubuntu, SLES, RHEL)

Sets the CPU governor to "performance" to enable the highest supported performance state for all cores.
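After the governor has been changed, the result can be checked through sysfs. This is a minimal sketch, assuming the cpufreq files live under /sys/devices/system/cpu. The function name governors_in_use is hypothetical, and it takes the root directory as a parameter so it can be pointed at another tree.

```shell
#!/bin/sh
# Sketch: list the distinct governors in use across all cores.
# cpu_root is a parameter so the function can be pointed at a test tree.
governors_in_use() {
    cpu_root=${1:-/sys/devices/system/cpu}
    cat "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null | sort -u
}

result=$(governors_in_use)
if [ "$result" = "performance" ]; then
    echo "all cores use the performance governor"
else
    echo "governors in use: ${result:-none reported}"
fi
```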

powersave -f (on SuSE)

Makes the powersave daemon set the CPUs to the highest supported frequency.

/etc/init.d/cpuspeed stop (on Red Hat)

Disables the CPU frequency scaling program in order to set the CPUs to the highest supported frequency.

OMP_NUM_THREADS

Sets the maximum number of parallel threads that OpenMP-based applications may use.
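As an illustration, the variable can be set from the number of online logical CPUs. Using every CPU is an assumption made for this sketch. A given run may deliberately use fewer threads.

```shell
#!/bin/sh
# Sketch: set OMP_NUM_THREADS to the number of online logical CPUs.
# Using every CPU is an assumption; a run may deliberately use fewer.
ncpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
export OMP_NUM_THREADS="$ncpus"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```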

LD_LIBRARY_PATH

An environment variable that indicates the location in the filesystem of bundled libraries to use when running the benchmark binaries.
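A minimal sketch of setting this variable follows. The helper name prepend_lib_path and the directory /opt/example/lib are placeholders, not paths used by any particular result.

```shell
#!/bin/sh
# Sketch: prepend a directory to LD_LIBRARY_PATH without leaving a
# stray ":" when the variable is empty or unset. An empty entry would
# make the loader search the current directory.
prepend_lib_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# /opt/example/lib is a placeholder for the directory holding the
# bundled libraries.
prepend_lib_path /opt/example/lib
echo "$LD_LIBRARY_PATH"
```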

kernel/randomize_va_space

This option can be used to select the type of process address space randomization that is used in the system, for architectures that support this feature.

0 - Turn the process address space randomization off. This is the default for architectures that do not support this feature anyway, and kernels that are booted with the "norandmaps" parameter.
1 - Make the addresses of mmap base, stack and VDSO page randomized. This, among other things, implies that shared libraries will be loaded to random addresses. Also for PIE-linked binaries, the location of code start is randomized. This is the default if the CONFIG_COMPAT_BRK option is enabled.
2 - Additionally enable heap randomization. This is the default if CONFIG_COMPAT_BRK is disabled.
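The current setting can be read from /proc/sys/kernel/randomize_va_space. The sketch below maps the value to the meanings listed above. The function name describe_aslr is hypothetical.

```shell
#!/bin/sh
# Sketch: translate the current kernel/randomize_va_space value into
# the meanings listed above. describe_aslr is an illustrative helper.
describe_aslr() {
    case $1 in
        0) echo "randomization off" ;;
        1) echo "mmap base, stack and VDSO randomized" ;;
        2) echo "mmap base, stack, VDSO and heap randomized" ;;
        *) echo "unknown value: $1" ;;
    esac
}

aslr_file=/proc/sys/kernel/randomize_va_space
if [ -r "$aslr_file" ]; then
    describe_aslr "$(cat "$aslr_file")"
fi
```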

MALLOC_CONF

An environment variable set to tune the jemalloc allocation strategy during the execution of the binaries. This environment variable setting is not needed when building the binaries on the system under test.
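jemalloc reads MALLOC_CONF as a comma-separated list of option:value pairs. The sketch below builds such a string. The helper name join_malloc_conf is hypothetical, and the options shown (retain and thp) are illustrative values that assume a jemalloc version supporting them. They are not the settings of any particular result.

```shell
#!/bin/sh
# Sketch: build a MALLOC_CONF string from option:value pairs.
# jemalloc reads a comma-separated list; the options below are only
# illustrative values, not the settings of any particular result.
join_malloc_conf() {
    conf=
    for opt in "$@"; do
        conf="${conf:+$conf,}$opt"
    done
    echo "$conf"
}

MALLOC_CONF=$(join_malloc_conf retain:true thp:always)
export MALLOC_CONF
echo "MALLOC_CONF=$MALLOC_CONF"
```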

Compilers: AMD Optimizing C/C++ Compiler Suite 1.3.0

Splitter rule for plugin arguments: -fplugin-arg-dragonegg-llvm-option="-flag[ -flag...]"

Consumer rule for the tail of split up plugin arguments: -fplugin-arg-dragonegg-llvm-option=""

-O

Set the optimization level to -O2.