CPU2017 Flag Description
Quanta Cloud Technology QuantaGrid S43KL-1U (AMD EPYC 7742 64-Core, 2.25GHz)

Test sponsored by Quanta Computer Inc.

Compilers: AMD Optimizing C/C++ Compiler Suite


Base Compiler Invocation

C benchmarks

C++ benchmarks

Fortran benchmarks


Peak Compiler Invocation

C benchmarks

C++ benchmarks

Fortran benchmarks


Base Portability Flags

600.perlbench_s

602.gcc_s

605.mcf_s

620.omnetpp_s

623.xalancbmk_s

625.x264_s

631.deepsjeng_s

641.leela_s

648.exchange2_s

657.xz_s


Peak Portability Flags

600.perlbench_s

602.gcc_s

605.mcf_s

620.omnetpp_s

623.xalancbmk_s

625.x264_s

631.deepsjeng_s

641.leela_s

648.exchange2_s

657.xz_s


Base Optimization Flags

C benchmarks

C++ benchmarks

Fortran benchmarks


Peak Optimization Flags

C benchmarks

600.perlbench_s

602.gcc_s

605.mcf_s

625.x264_s

657.xz_s

C++ benchmarks

620.omnetpp_s

623.xalancbmk_s

631.deepsjeng_s

641.leela_s

Fortran benchmarks


Base Other Flags

C benchmarks

C++ benchmarks

Fortran benchmarks


Peak Other Flags

C benchmarks

C++ benchmarks (except as noted below)

623.xalancbmk_s

Fortran benchmarks


Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.


Commands and Options Used to Submit Benchmark Runs

Using numactl to bind processes and memory to cores

For multi-copy runs or single copy runs on systems with multiple sockets, it is advantageous to bind a process to a particular core. Otherwise, the OS may arbitrarily move your process from one core to another. This can affect performance. To help, SPEC allows the use of a "submit" command where users can specify a utility to use to bind processes. We have found the utility 'numactl' to be the best choice.

numactl runs processes with a specific NUMA scheduling or memory placement policy. The policy is set for a command and inherited by all of its children. The numactl flag "--physcpubind" specifies which core(s) to bind the process. "-l" instructs numactl to keep a process's memory on the local node while "-m" specifies which node(s) to place a process's memory. For full details on using numactl, please refer to your Linux documentation, 'man numactl'

Note that some older versions of numactl incorrectly interpret application arguments as its own. For example, with the command "numactl --physcpubind=0 -l a.out -m a", numactl will interpret a.out's "-m" option as its own "-m" option. To work around this problem, we put the command to be run in a shell script and then run the shell script using numactl. For example: "echo 'a.out -m a' > run.sh ; numactl --physcpubind=0 bash run.sh"


Shell, Environment, and Other Software Settings

Transparent Huge Pages (THP)

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. THP is designed to hide much of the complexity in using huge pages from system administrators and developers, as normal huge pages must be assigned at boot time, can be difficult to manage manually, and often require significant changes to code in order to be used effectively. Most recent Linux OS releases have THP enabled by default.

Linux Huge Page settings

If you need finer control you can manually set huge pages using the following steps:

Note that further information about huge pages may be found in the Linux kernel documentation file hugetlbpage.txt.

ulimit -s <n>

Sets the stack size to n kbytes, or unlimited to allow the stack size to grow without limit.

ulimit -l <n>

Sets the maximum size of memory that may be locked into physical memory.

powersave -f (on SuSE)

Makes the powersave daemon set the CPUs to the highest supported frequency.

/etc/init.d/cpuspeed stop (on Red Hat)

Disables the cpu frequency scaling program in order to set the CPUs to the highest supported frequency.

LD_LIBRARY_PATH

An environment variable that indicates the location in the filesystem of bundled libraries to use when running the benchmark binaries.

kernel/randomize_va_space

This option can be used to select the type of process address space randomization that is used in the system, for architectures that support this feature.

MALLOC_CONF

An environment variable set to tune the jemalloc allocation strategy during the execution of the binaries. This environment variable setting is not needed when building the binaries on the system under test.


Operating System Tuning Parameters

OS Tuning

ulimit:

Used to set user limits of system-wide resources. Provides control over resources available to the shell and processes started by it. Some common ulimit commands may include:

Disabling Linux services:

Certain Linux services may be disabled to minimize tasks that may consume CPU cycles.

irqbalance:

Disabled through "service irqbalance stop". Depending on the workload involved, the irqbalance service reassigns various IRQ's to system CPUs. Though this service might help in some situations, disabling it can also help environments which need to minimize or eliminate latency to more quickly respond to events.

Performance Governors (Linux):

In-kernel CPU frequency governors are pre-configured power schemes for the CPU. The CPUfreq governors use P-states to change frequencies and lower power consumption. The dynamic governors can switch between CPU frequencies, based on CPU utilization to allow for power savings while not sacrificing performance.

Other options beside a generic performance governor can be set, such as the perf-bias:

--perf-bias, -b

On supported Intel processors, this option sets a register which allows the cpupower utility (or other software/firmware) to set a policy that controls the relative importance of performance versus energy savings to the processor. The range of valid numbers is 0-15, where 0 is maximum performance and 15 is maximum energy efficiency.

The processor uses this information in model-specific ways when it must select trade-offs between performance and energy efficiency. This policy hint does not supersede Processor Performance states (P-states) or CPU Idle power states (C-states), but allows software to have influence where it would otherwise be unable to express a preference.

On many Linux systems one can set the perf-bias for all CPUs through the cpupower utility with one of the following commands:

Tuning Kernel parameters:

The following Linux Kernel parameters were tuned to better optimize performance of some areas of the system:

tuned-adm:

The tuned-adm tool is a commandline interface for switching between different tuning profiles available to the tuned tuning daeomn available in supported Linux distros. The default configuration file is located in /etc/tuned.conf and the supported profiles can be found in /etc/tune-profiles.

Some profiles that may be available by default include: default, desktop-powersave, server-powersave, laptop-ac-powersave, laptop-battery-powersave, spindown-disk, throughput-performance, latency-performance, enterprise-storage

To set a profile, one can issue the command "tuned-adm profile (profile_name)". Here are details about relevant profiles.

Transparent Huge Pages (THP):

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. THP is designed to hide much of the complexity in using huge pages from system administrators and developers, as normal huge pages must be assigned at boot time, can be difficult to manage manually, and often require significant changes to code in order to be used effectively. Transparent Hugepages increase the memory page size from 4 kilobytes to 2 megabytes. Transparent Hugepages provide significant performance advantages on systems with highly contended resources and large memory workloads. If memory utilization is too high or memory is badly fragmented which prevents hugepages being allocated, the kernel will assign smaller 4k pages instead. Most recent Linux OS releases have THP enabled by default.

Linux Huge Page settings:

If you need finer control and manually set the Huge Pages you can follow the below steps:

Note that further information about huge pages may be found in your Linux documentation file: /usr/src/linux/Documentation/vm/hugetlbpage.txt


Firmware / BIOS / Microcode Settings

Determinism Control:
This BIOS option allows user to choose AGESA determinism control. Available settings are:
Determinism Slider:
Selects the determinism mode for the CPU:
cTDP Control(Configurable TDP):
TDP is an acronym for “Thermal Design Power.” TDP is the recommended target for power used when designing the cooling capacity for a server. Configures the maximum power that the CPU will consume, up to the platform power limit (PPT). Valid values vary by CPU model. If value outside the valid range is set, the CPU will automatically adjust the value so that it does fall within the valid range. When increasing cTDP, additional power will only be consumed up to the Package Power Limit (PPT), which may be less than the cTDP setting.
Model Nominal TDP Minimum cTDP Maximum cTDP**
EPYC 7742 225 225 240
EPYC 7702 200 165 200
Package Power Limit (PPT) Control:
Specifies the maximum power that each CPU package may consume in the system. The actual power limit is the maximum of the Package Power Limit and cTDP. Available settings are:
NUMA nodes per socket (NPS):
Non-Uniform Memory Architecture (NUMA) enables the CPU cores to access memory via NUMA domains / nodes. Users can specify the number of desired NUMA nodes per populated socket in the system:
SMT Control:
Can be used to disable symmetric multithreading. To re-enable SMT, a POWER CYCLE is needed after selecting the 'Auto' option. WARNING - S3 is NOT SUPPORTED on systems where SMT is disabled.
IOMMU:
Enable: Enables the I/O Memory Management Unit (IOMMU), which extends the AMD64 system architecture by adding support for address translation and system memory access protection on DMA transfers from peripheral devices.
APBDIS:
APBDis is an IO Boost disable on uncore. For any system user that needs to block these uncore optimizations that are impacting base core clock speed, we are exposing a method to disable this behavior called APBDis. This locks the fabric clock to the non-boosted speeds. Available settings are:

Last updated Oct. 29, 2019.


Flag description origin markings:

[user] Indicates that the flag description came from the user flags file.
[suite] Indicates that the flag description came from the suite-wide flags file.
[benchmark] Indicates that the flag description came from a per-benchmark flags file.

The flags files that were used to format this result can be browsed at
http://www.spec.org/cpu2017/flags/aocc200-flags-B1_v2.html,
http://www.spec.org/cpu2017/flags/Quanta-Computer-Inc-amd-speccpu-setting-v1_AMD_ROME.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/cpu2017/flags/aocc200-flags-B1_v2.xml,
http://www.spec.org/cpu2017/flags/Quanta-Computer-Inc-amd-speccpu-setting-v1_AMD_ROME.xml.


For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2019 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.0.5.
Report generated on 2019-11-26 12:45:36 by SPEC CPU2017 flags formatter v5178.