CPU2017 Flag Description
Lenovo Global Technology ThinkSystem SR635 2.00 GHz, AMD EPYC 7702

This result has been formatted using multiple flags files. The "default header section" from each of them appears next.


Default header section from aocc130-flags-revA21

AMD Optimizing C/C++ Compiler Suite Version 1.3.0 SPEC CPU2017 Flag Description

Compilers: AMD Optimizing C/C++ Compiler Suite 1.3.0


Default header section from gcc

GNU Compiler Collection Flags

Flag descriptions for GCC, the GNU Compiler Collection

Note: The GNU Compiler Collection provides a wide array of compiler options, described in detail and readily available at https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html#Option-Index and https://gcc.gnu.org/onlinedocs/gfortran/. This SPEC CPU flags file contains excerpts from and brief summaries of portions of that documentation.

SPEC's modifications are:
Copyright (C) 2006-2017 Standard Performance Evaluation Corporation

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being "Funding Free Software", the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in your SPEC CPU kit at $SPEC/Docs/licenses/FDL.v1.3 and on the web at http://www.spec.org/cpu2017/Docs/licenses/FDL.v1.3. A copy of "Funding Free Software" is on your SPEC CPU kit at $SPEC/Docs/licenses/FundingFreeSW and on the web at http://www.spec.org/cpu2017/Docs/licenses/FundingFreeSW.

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.


Base Compiler Invocation

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C

Benchmarks using Fortran, C, and C++


Peak Compiler Invocation

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C

Benchmarks using Fortran, C, and C++


Base Portability Flags

603.bwaves_s

607.cactuBSSN_s

619.lbm_s

621.wrf_s

627.cam4_s

628.pop2_s

638.imagick_s

644.nab_s

649.fotonik3d_s

654.roms_s


Peak Portability Flags

603.bwaves_s

607.cactuBSSN_s

619.lbm_s

621.wrf_s

627.cam4_s

628.pop2_s

638.imagick_s

644.nab_s

649.fotonik3d_s

654.roms_s


Base Optimization Flags

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C

Benchmarks using Fortran, C, and C++


Peak Optimization Flags

C benchmarks

619.lbm_s

638.imagick_s

644.nab_s

Fortran benchmarks

603.bwaves_s

649.fotonik3d_s

654.roms_s

Benchmarks using both Fortran and C

621.wrf_s

627.cam4_s

628.pop2_s

Benchmarks using Fortran, C, and C++


Base Other Flags

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C

Benchmarks using Fortran, C, and C++


Peak Other Flags

C benchmarks

Fortran benchmarks

Benchmarks using both Fortran and C

Benchmarks using Fortran, C, and C++


Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.


Commands and Options Used to Submit Benchmark Runs

This result has been formatted using multiple flags files. The "submit command" from each of them appears next.


Submit command from aocc130-flags-revA21

AMD Optimizing C/C++ Compiler Suite Version 1.3.0 SPEC CPU2017 Flag Description

Using numactl to bind processes and memory to cores

For multi-copy runs or single copy runs on systems with multiple sockets, it is advantageous to bind a process to a particular core. Otherwise, the OS may arbitrarily move your process from one core to another. This can affect performance. To help, SPEC allows the use of a "submit" command where users can specify a utility to use to bind processes. We have found the utility 'numactl' to be the best choice.

numactl runs processes with a specific NUMA scheduling or memory placement policy. The policy is set for a command and inherited by all of its children. The numactl flag "--physcpubind" specifies which core(s) to bind the process to. "-l" instructs numactl to keep a process's memory on the local node, while "-m" specifies the node(s) on which to place a process's memory. For full details on using numactl, please refer to your Linux documentation, 'man numactl'.

Note that some older versions of numactl incorrectly interpret application arguments as their own. For example, with the command "numactl --physcpubind=0 -l a.out -m a", numactl will interpret a.out's "-m" option as its own "-m" option. To work around this problem, we put the command to be run in a shell script and then run the shell script using numactl. For example: "echo 'a.out -m a' > run.sh ; numactl --physcpubind=0 bash run.sh"
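
The workaround above can be sketched in Python as follows. This is a minimal illustration, not the submit command used for this result: the helper name build_numactl_command, the script path, and the core number are assumptions made for the example. Only the numactl options "--physcpubind" and "-l" come from the text.

```python
import os
import tempfile


def build_numactl_command(app_command, core, script_path):
    """Write app_command to a shell script and return a numactl argv for it.

    Wrapping the command in a script keeps application options such as
    "-m" away from numactl's own argument parser.
    """
    with open(script_path, "w") as script:
        script.write(app_command + "\n")
    # --physcpubind pins the process to the core; -l keeps memory on the local node.
    return ["numactl", f"--physcpubind={core}", "-l", "bash", script_path]


if __name__ == "__main__":
    # Hypothetical location for the wrapper script.
    path = os.path.join(tempfile.mkdtemp(), "run.sh")
    argv = build_numactl_command("a.out -m a", 0, path)
    print(" ".join(argv))
```

The returned list could be handed to subprocess.run() on a system where numactl is installed; the sketch itself only prepares the script and the argument list.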


Submit command from gcc

GNU Compiler Collection Flags

SPECrate runs might use one of these methods to bind processes to specific processors, depending on the config file.


Commands and Options Used for Feedback-Directed Optimization

No special commands are needed for feedback-directed optimization, other than the compiler profile flags.


Shell, Environment, and Other Software Settings

This result has been formatted using multiple flags files. The "sw environment" from each of them appears next.


Sw environment from aocc130-flags-revA21

AMD Optimizing C/C++ Compiler Suite Version 1.3.0 SPEC CPU2017 Flag Description

Transparent Huge Pages (THP)

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. THP is designed to hide much of the complexity in using huge pages from system administrators and developers, as normal huge pages must be assigned at boot time, can be difficult to manage manually, and often require significant changes to code in order to be used effectively. Most recent Linux OS releases have THP enabled by default.
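
One way to check whether THP is active is to read the kernel's sysfs entry, which lists the available modes and puts the one in effect in square brackets. The sketch below assumes the standard Linux path /sys/kernel/mm/transparent_hugepage/enabled; the function name active_thp_mode is hypothetical.

```python
from pathlib import Path

# Standard sysfs location on Linux; may be absent on kernels built without THP.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs file contents.

    The file lists every mode and brackets the one in effect,
    for example "always [madvise] never".
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print("THP mode:", active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("THP sysfs entry not found on this system")
```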

Linux Huge Page settings

If you need finer control you can manually set huge pages using the following steps:

Note that further information about huge pages may be found in the Linux kernel documentation file hugetlbpage.txt.

ulimit -s <n>

Sets the stack size to n kbytes, or unlimited to allow the stack size to grow without limit.

ulimit -l <n>

Sets the maximum size of memory that may be locked into physical memory.
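
The limits set by the two ulimit commands above can be inspected from Python with the standard resource module (Unix only). This is a sketch for checking the values, not part of the benchmark configuration; describe_limit is a hypothetical helper. It assumes the shell reports both limits in kbytes, while getrlimit returns bytes.

```python
import resource


def describe_limit(soft_limit_bytes):
    """Format a soft limit the way ulimit reports it: kbytes or "unlimited"."""
    if soft_limit_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return str(soft_limit_bytes // 1024)


if __name__ == "__main__":
    stack_soft, _ = resource.getrlimit(resource.RLIMIT_STACK)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("ulimit -s:", describe_limit(stack_soft))
    print("ulimit -l:", describe_limit(memlock_soft))
```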

cpupower frequency-set -r -g performance (on Ubuntu, SLES, RHEL)

Sets the CPU governor to "performance" to enable the highest supported performance state for all cores.

powersave -f (on SuSE)

Makes the powersave daemon set the CPUs to the highest supported frequency.

/etc/init.d/cpuspeed stop (on Red Hat)

Disables the CPU frequency scaling program in order to set the CPUs to the highest supported frequency.
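
To confirm that a governor setting such as the one described above took effect, the per-CPU cpufreq entries in sysfs can be read back. The sketch below assumes the usual Linux layout under /sys/devices/system/cpu; the function names are hypothetical, and on systems without a cpufreq driver the result is simply empty.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. "cpu0") to its current cpufreq governor."""
    governors = {}
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        cpu_name = gov_file.parent.parent.name
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


def all_performance(governors):
    """True only if at least one CPU was found and every one uses "performance"."""
    return bool(governors) and all(g == "performance" for g in governors.values())


if __name__ == "__main__":
    print("all cores on performance governor:", all_performance(read_governors()))
```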

OMP_NUM_THREADS

Sets the maximum number of parallel threads that OpenMP-based applications may use.

LD_LIBRARY_PATH

An environment variable that indicates the location in the filesystem of bundled libraries to use when running the benchmark binaries.

kernel/randomize_va_space

This option can be used to select the type of process address space randomization that is used in the system, for architectures that support this feature.
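
The current setting can be read from /proc/sys/kernel/randomize_va_space. The sketch below maps the value to a short description. The meanings of 0, 1, and 2 are taken from the Linux kernel sysctl documentation rather than from this flags file, and describe_aslr is a hypothetical helper name.

```python
# Meanings per the Linux kernel sysctl documentation (not stated in this flags file).
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base, and VDSO randomized",
    2: "as 1, plus heap (brk) randomized",
}


def describe_aslr(raw_value):
    """Translate the contents of randomize_va_space into a short description."""
    return ASLR_MODES.get(int(raw_value.strip()), "unknown value")


if __name__ == "__main__":
    try:
        with open("/proc/sys/kernel/randomize_va_space") as f:
            print(describe_aslr(f.read()))
    except OSError:
        print("randomize_va_space is not available on this system")
```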

MALLOC_CONF

An environment variable set to tune the jemalloc allocation strategy during the execution of the binaries. This environment variable setting is not needed when building the binaries on the system under test.
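
The environment variables described above (OMP_NUM_THREADS, LD_LIBRARY_PATH, MALLOC_CONF) are typically set just before launching a benchmark binary. The following is a minimal sketch of building such an environment in Python. The helper name benchmark_env and all values passed to it are assumptions for illustration; the actual values for a given result are listed in its notes section.

```python
import os


def benchmark_env(threads, lib_dir, malloc_conf=None, base_env=None):
    """Return a copy of the environment with the run-time variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    # Prepend so the bundled libraries are found before any system copies.
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else f"{lib_dir}:{existing}"
    # MALLOC_CONF only matters at run time, and only for jemalloc-linked binaries.
    if malloc_conf is not None:
        env["MALLOC_CONF"] = malloc_conf
    return env


if __name__ == "__main__":
    # Hypothetical thread count and library directory.
    env = benchmark_env(64, "/path/to/bundled/libs")
    print(env["OMP_NUM_THREADS"], env["LD_LIBRARY_PATH"])
```

The resulting dictionary can be passed as the env argument of subprocess.run() so that the settings apply only to the child process.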


Sw environment from gcc

GNU Compiler Collection Flags

One or more of the following may have been used in the run. If so, it will be listed in the notes sections. Here is a brief guide to understanding them:


Operating System Tuning Parameters

sched_cfs_bandwidth_slice_us
This OS setting controls the amount of run-time (bandwidth) transferred to a run queue from the task's control group bandwidth pool. Small values allow the global bandwidth to be shared in a fine-grained manner among tasks; larger values reduce transfer overhead. The default value is 5000 (us).
sched_latency_ns
This OS setting configures targeted preemption latency for CPU bound tasks. The default value is 24000000 (ns).
sched_migration_cost_ns
Amount of time after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations. The default value is 500000 (ns).
sched_min_granularity_ns
This OS setting controls the minimal preemption granularity for CPU-bound tasks. As the number of runnable tasks increases, CFS (Completely Fair Scheduler), the scheduler of the Linux kernel, decreases the timeslices of tasks. If the number of runnable tasks exceeds sched_latency_ns/sched_min_granularity_ns, the scheduling period is stretched to number_of_running_tasks * sched_min_granularity_ns. The default value is 8000000 (ns).
sched_wakeup_granularity_ns
This OS setting controls the wake-up preemption granularity. Increasing this variable reduces wake-up preemption, reducing disturbance of compute bound tasks. Lowering it improves wake-up latency and throughput for latency critical tasks, particularly when a short duty cycle load component must compete with CPU bound components. The default value is 10000000 (ns).
numa_balancing
This OS setting controls automatic NUMA balancing on memory mapping and process placement. Setting 0 disables this feature. It is enabled by default (1).
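
The interaction between sched_latency_ns and sched_min_granularity_ns described above can be worked through numerically. The sketch below is a simplification under stated assumptions: it treats the computed value as the period in which every runnable task runs once, assumes all tasks have equal weight, and ignores kernel rounding details. The function names are hypothetical; the default values are the ones quoted in the text.

```python
def cfs_period_ns(nr_running, latency_ns=24_000_000, min_granularity_ns=8_000_000):
    """Period in which each runnable task is scheduled once.

    Up to latency_ns / min_granularity_ns tasks share latency_ns; beyond
    that, the period grows by min_granularity_ns per task.
    """
    if nr_running > latency_ns // min_granularity_ns:
        return nr_running * min_granularity_ns
    return latency_ns


def equal_share_ns(nr_running, **kwargs):
    """Per-task share of the period, assuming all tasks have equal weight."""
    return cfs_period_ns(nr_running, **kwargs) // nr_running


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, cfs_period_ns(n), equal_share_ns(n))
```

With the quoted defaults the threshold is 24000000 / 8000000 = 3 tasks: two tasks get 12 ms each within a 24 ms period, while five tasks stretch the period to 40 ms so that each still gets 8 ms.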

Firmware / BIOS / Microcode Settings

Operating Modes Selections: (Default="Maximum Efficiency")
Select the operating mode based on your preference. Note that power savings and performance are also highly dependent on the hardware and software running on the system.
Determinism Slider:
Auto = Use default performance determinism settings. Other options: Power, Performance.
Global C-state Control:
Controls IO based C-state generation and DF C-states.
cTDP Control:
Auto = Use the fused cTDP. Manual = User can set customized cTDP.
NUMA nodes per socket:
Specifies the number of desired NUMA nodes per socket. Zero will attempt to interleave the two sockets together.
Package Power Limit Control:
Auto = Use the fused PPT. Manual = User can set customized PPT. PPT will be used as the ASIC power limit.
SMT Mode:
Can be used to disable simultaneous multithreading (SMT). To re-enable SMT, a POWER CYCLE is needed after selecting the 'Auto' option. WARNING - S3 is NOT SUPPORTED on systems where SMT is disabled.
CCD Control:
Sets the number of CCDs to be used. Once this option has been used to remove any CCDs, a POWER CYCLE is required in order for future selections to take effect.
EfficiencyModeEn:
0 = Use performance-optimized CCLK DPM settings. 1 = Use power-efficiency-optimized CCLK DPM settings.
LLC as NUMA Node:
Exposes the processor's last-level caches (LLCs) as NUMA nodes. When enabled, this can improve performance for highly NUMA-optimized workloads if the workloads, or components of them, can be pinned to the caches.

Flag description origin markings:

[user] Indicates that the flag description came from the user flags file.
[suite] Indicates that the flag description came from the suite-wide flags file.
[benchmark] Indicates that the flag description came from a per-benchmark flags file.

The flags files that were used to format this result can be browsed at
http://www.spec.org/cpu2017/flags/aocc130-flags-revA21.html,
http://www.spec.org/cpu2017/flags/gcc.2019-08-07.html,
http://www.spec.org/cpu2017/flags/Lenovo-Platform-SPECcpu2017-Flags-V1.2-Rome-A.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/cpu2017/flags/aocc130-flags-revA21.xml,
http://www.spec.org/cpu2017/flags/gcc.2019-08-07.xml,
http://www.spec.org/cpu2017/flags/Lenovo-Platform-SPECcpu2017-Flags-V1.2-Rome-A.xml.


For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2019 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.0.5.
Report generated on 2019-09-03 14:51:02 by SPEC CPU2017 flags formatter v5178.