CPU2017 Result Flag Description

Base Portability Flags

600.perlbench_s

- -DSPEC_LP64
- PORTABILITY
- This macro specifies that the target system uses the LP64 data model; specifically, that integers are 32 bits, while longs and pointers are 64 bits.
- Includes:
- -DSPEC_LINUX_X64
- CPORTABILITY
- This macro indicates that the benchmark is being compiled on an AMD64-compatible system running the Linux operating system.
- Includes:
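
As a rough illustration of what the LP64 data model described above means in practice, the following Python sketch classifies a data model from the bit widths of `int`, `long`, and pointers. The function names `classify_data_model` and `host_data_model`, and the use of `ctypes` to probe the host, are assumptions made for this example; they are not part of the SPEC toolset.

```python
import ctypes


def classify_data_model(int_bits, long_bits, pointer_bits):
    """Return a conventional data-model name for the given C type widths."""
    widths = (int_bits, long_bits, pointer_bits)
    if widths == (32, 64, 64):
        return "LP64"    # int is 32 bits; long and pointers are 64 bits
    if widths == (32, 32, 64):
        return "LLP64"   # long stays 32 bits; only pointers are 64 bits
    if widths == (32, 32, 32):
        return "ILP32"   # int, long, and pointers are all 32 bits
    return "other"


def host_data_model():
    """Probe the C type widths on the machine running this script."""
    return classify_data_model(
        8 * ctypes.sizeof(ctypes.c_int),
        8 * ctypes.sizeof(ctypes.c_long),
        8 * ctypes.sizeof(ctypes.c_void_p),
    )


if __name__ == "__main__":
    print(host_data_model())
```

On a system matching the -DSPEC_LP64 description, `host_data_model()` would be expected to report "LP64".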

602.gcc_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

605.mcf_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

620.omnetpp_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

623.xalancbmk_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
- -DSPEC_LINUX
- CXXPORTABILITY
- This flag can be set for SPEC compilation on Linux using the default compiler.

625.x264_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

631.deepsjeng_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

641.leela_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

648.exchange2_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

657.xz_s

- -DSPEC_LP64
- PORTABILITY
- This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.

Base Optimization Flags

C benchmarks

- -m64
- intel_icc,intel_icpc,intel_ifort,intel_icx,intel_icpx,intel_ifx
- CC, LD
- Compiles for a 64-bit (LP64) data model.
- -std=c11
- intel_icc,intel_icx,intel_icpx
- CC, LD
- Sets the language dialect to conform to the indicated C standard.
- -Wl,-z,muldefs
- EXTRA_LDFLAGS
- Enables SmartHeap and/or other library usage by forcing the linker to ignore multiple definitions, if present.
- -xCORE-AVX512
- COPTIMIZE
- Code is optimized for Intel(R) processors with support for CORE-AVX512 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  Do not use this option if the program will be executed on a processor that is not an Intel processor. If this option is used to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error when it is executed on an unsupported processor.
- -O3
- COPTIMIZE
- Enable O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enable optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On IA-32 and Intel EM64T processors, when O3 is used with options -ax or -x (Linux) or with options /Qax or /Qx (Windows), the compiler performs more aggressive data dependency analysis than for O2, which may result in longer compilation times. The O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to O2 optimizations. The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      
      -funroll-loops
      
      -fno-builtin
      
      -mno-ieee-fp
      
      -fomit-framepointer
      
      -ffunction-sections
      
      -ftz
- -ffast-math
- COPTIMIZE
- Enable fast math mode. This option may yield faster code for programs that do not require the guarantees of exact implementation of IEEE or ISO rules/specifications for math functions.
- -flto
- COPTIMIZE
- Performs link-time optimization, also known as interprocedural optimization.
- -mfpmath=sse
- COPTIMIZE
- Generates floating-point arithmetic for the selected unit. Here, the scalar floating-point instructions present in the SSE instruction set are used.
- -funroll-loops
- COPTIMIZE
- Tells the compiler the maximum number of times to unroll loops. For example -funroll-loops0 would disable unrolling of loops.
- -qopt-mem-layout-trans=4
- COPTIMIZE
- Controls the level of memory layout transformations performed by the compiler. This option can improve cache reuse and cache locality.
  - 0: Disables memory layout transformations. This is the same as specifying -qno-opt-mem-layout-trans
  - 1: Enable basic memory layout transformations like structure splitting, structure peeling, field inlining, field reordering, array field transpose, increase field alignment etc.
  - 2: Enable more memory layout transformations like advanced structure splitting. This is the same as specifying -qopt-mem-layout-trans
  - 3: Enable more memory layout transformations like copy-in/copy-out of structures for a region of code. You should only use this setting if your system has more than 4GB of physical memory per core.
  - 4: Compiler is more aggressive in using memory layout transformations. You should only use this setting if your system has more than 4GB of physical memory per core.
- -fiopenmp
- Yes
- COPTIMIZE
- -DSPEC_OPENMP
- COPTIMIZE
- Definition of this macro indicates that compilation for parallel operation is enabled, and that any OpenMP directives or pragmas will be visible to the compiler. The behavior of this macro is overridden if -DSPEC_SUPPRESS_OPENMP also appears in the list of compilation flags.
- -L/usr/local/jemalloc64-5.0.1/lib
- EXTRA_LIBS
- Specify build time link path for jemalloc 64bit built to support the CPU 2017 build. See jemalloc.net for more information.
- -ljemalloc
- EXTRA_LIBS
- Linker toggle to specify jemalloc linker library. See jemalloc.net for more information.
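
The caveat attached to -ffast-math above can be made concrete. Fast-math modes in GCC- and Clang-based compilers generally permit reassociating floating-point operations; this sketch assumes the same holds for the compilers in this report. The Python code below does not invoke any compiler. It only shows that two algebraically equal groupings of a sum give different IEEE 754 double-precision results, which is the kind of difference a reassociating compiler may introduce. The helper names are hypothetical.

```python
def group_left(a, b, c):
    """Evaluate (a + b) + c, the order written in the source."""
    return (a + b) + c


def group_right(a, b, c):
    """Evaluate a + (b + c), a grouping a reassociating compiler may choose."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 1e16, -1e16, 1.0
    # The large terms cancel first, so the small term survives.
    print(group_left(a, b, c))
    # The small term is absorbed by -1e16 before the cancellation happens.
    print(group_right(a, b, c))
```

Programs whose results depend on a particular grouping are the ones that "require the guarantees" mentioned in the flag description.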

C++ benchmarks

- -m64
- intel_icc,intel_icpc,intel_ifort,intel_icx,intel_icpx,intel_ifx
- CXX, LD
- Compiles for a 64-bit (LP64) data model.
- -Wl,-z,muldefs
- EXTRA_LDFLAGS
- Enables SmartHeap and/or other library usage by forcing the linker to ignore multiple definitions, if present.
- -xCORE-AVX512
- CXXOPTIMIZE
- Code is optimized for Intel(R) processors with support for CORE-AVX512 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  Do not use this option if the program will be executed on a processor that is not an Intel processor. If this option is used to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error when it is executed on an unsupported processor.
- -O3
- CXXOPTIMIZE
- Enable O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enable optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On IA-32 and Intel EM64T processors, when O3 is used with options -ax or -x (Linux) or with options /Qax or /Qx (Windows), the compiler performs more aggressive data dependency analysis than for O2, which may result in longer compilation times. The O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to O2 optimizations. The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      
      -funroll-loops
      
      -fno-builtin
      
      -mno-ieee-fp
      
      -fomit-framepointer
      
      -ffunction-sections
      
      -ftz
- -ffast-math
- CXXOPTIMIZE
- Enable fast math mode. This option may yield faster code for programs that do not require the guarantees of exact implementation of IEEE or ISO rules/specifications for math functions.
- -flto
- CXXOPTIMIZE
- Performs link-time optimization, also known as interprocedural optimization.
- -mfpmath=sse
- CXXOPTIMIZE
- Generates floating-point arithmetic for the selected unit. Here, the scalar floating-point instructions present in the SSE instruction set are used.
- -funroll-loops
- CXXOPTIMIZE
- Tells the compiler the maximum number of times to unroll loops. For example -funroll-loops0 would disable unrolling of loops.
- -qopt-mem-layout-trans=4
- CXXOPTIMIZE
- Controls the level of memory layout transformations performed by the compiler. This option can improve cache reuse and cache locality.
  - 0: Disables memory layout transformations. This is the same as specifying -qno-opt-mem-layout-trans
  - 1: Enable basic memory layout transformations like structure splitting, structure peeling, field inlining, field reordering, array field transpose, increase field alignment etc.
  - 2: Enable more memory layout transformations like advanced structure splitting. This is the same as specifying -qopt-mem-layout-trans
  - 3: Enable more memory layout transformations like copy-in/copy-out of structures for a region of code. You should only use this setting if your system has more than 4GB of physical memory per core.
  - 4: Compiler is more aggressive in using memory layout transformations. You should only use this setting if your system has more than 4GB of physical memory per core.
- -L/usr/local/jemalloc64-5.0.1/lib
- EXTRA_LIBS
- Specify build time link path for jemalloc 64bit built to support the CPU 2017 build. See jemalloc.net for more information.
- -ljemalloc
- EXTRA_LIBS
- Linker toggle to specify jemalloc linker library. See jemalloc.net for more information.

Fortran benchmarks

- -m64
- intel_icc,intel_icpc,intel_ifort,intel_icx,intel_icpx,intel_ifx
- FC, LD
- Compiles for a 64-bit (LP64) data model.
- -Wl,-z,muldefs
- EXTRA_LDFLAGS
- Enables SmartHeap and/or other library usage by forcing the linker to ignore multiple definitions, if present.
- -xCORE-AVX512
- FOPTIMIZE
- Code is optimized for Intel(R) processors with support for CORE-AVX512 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  Do not use this option if the program will be executed on a processor that is not an Intel processor. If this option is used to compile the main program (in Fortran) or the function main() in C/C++, the program will display a fatal run-time error when it is executed on an unsupported processor.
- -O3
- FOPTIMIZE
- Enable O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enable optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On IA-32 and Intel EM64T processors, when O3 is used with options -ax or -x (Linux) or with options /Qax or /Qx (Windows), the compiler performs more aggressive data dependency analysis than for O2, which may result in longer compilation times. The O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to O2 optimizations. The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      
      -funroll-loops
      
      -fno-builtin
      
      -mno-ieee-fp
      
      -fomit-framepointer
      
      -ffunction-sections
      
      -ftz
- -ffast-math
- FOPTIMIZE
- Enable fast math mode. This option may yield faster code for programs that do not require the guarantees of exact implementation of IEEE or ISO rules/specifications for math functions.
- -flto
- FOPTIMIZE
- Performs link-time optimization, also known as interprocedural optimization.
- -mfpmath=sse
- FOPTIMIZE
- Generates floating-point arithmetic for the selected unit. Here, the scalar floating-point instructions present in the SSE instruction set are used.
- -funroll-loops
- FOPTIMIZE
- Tells the compiler the maximum number of times to unroll loops. For example -funroll-loops0 would disable unrolling of loops.
- -qopt-mem-layout-trans=4
- FOPTIMIZE
- Controls the level of memory layout transformations performed by the compiler. This option can improve cache reuse and cache locality.
  - 0: Disables memory layout transformations. This is the same as specifying -qno-opt-mem-layout-trans
  - 1: Enable basic memory layout transformations like structure splitting, structure peeling, field inlining, field reordering, array field transpose, increase field alignment etc.
  - 2: Enable more memory layout transformations like advanced structure splitting. This is the same as specifying -qopt-mem-layout-trans
  - 3: Enable more memory layout transformations like copy-in/copy-out of structures for a region of code. You should only use this setting if your system has more than 4GB of physical memory per core.
  - 4: Compiler is more aggressive in using memory layout transformations. You should only use this setting if your system has more than 4GB of physical memory per core.
- -nostandard-realloc-lhs
- EXTRA_FOPTIMIZE
- Option standard-realloc-lhs (the default) tells the compiler that when the left-hand side of an assignment is an allocatable object, it should be reallocated to the shape of the right-hand side of the assignment before the assignment occurs. This is the current Fortran Standard definition. This feature may cause extra overhead at run time. This option has the same effect as option assume realloc_lhs.
  
  If you specify nostandard-realloc-lhs, the compiler uses the old Fortran 2003 rules when interpreting assignment statements. The left-hand side is assumed to be allocated with the correct shape to hold the right-hand side. If it is not, incorrect behavior will occur. This option has the same effect as option assume norealloc_lhs.
- -align array32byte
- EXTRA_FOPTIMIZE
- The align toggle changes how data elements are aligned. Variables and arrays are analyzed and memory layout can be altered. Specifying array32byte will look for opportunities to transform and realign arrays to 32-byte boundaries.
- -L/usr/local/jemalloc64-5.0.1/lib
- EXTRA_LIBS
- Specify build time link path for jemalloc 64bit built to support the CPU 2017 build. See jemalloc.net for more information.
- -ljemalloc
- EXTRA_LIBS
- Linker toggle to specify jemalloc linker library. See jemalloc.net for more information.
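
To make the 32-byte boundary named in the -align array32byte description more tangible, here is a small Python sketch of the underlying address arithmetic. It does not model what the compiler actually does when it realigns arrays; `align_up` and `is_aligned` are hypothetical helper names used only for illustration.

```python
def align_up(address, boundary=32):
    """Round address up to the next multiple of boundary (a power of two)."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    # Adding boundary - 1 and clearing the low bits rounds upward.
    return (address + boundary - 1) & ~(boundary - 1)


def is_aligned(address, boundary=32):
    """Return True when address already sits on the boundary."""
    return address % boundary == 0


if __name__ == "__main__":
    for addr in (0, 1, 32, 33, 100):
        print(addr, "->", align_up(addr), is_aligned(addr))
```

A 32-byte boundary matches the width of a 256-bit vector register, which is one plausible reason such an alignment can help vectorized loops.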

Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

Commands and Options Used to Submit Benchmark Runs

Shell, Environment, and Other Software Settings

Red Hat Specific features

Operating System Tuning Parameters

Used to set user limits of system-wide resources. Provides control over resources available to the shell and processes started by it. Some common ulimit commands may include:
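
As a hedged illustration of inspecting such limits, the sketch below uses Python's standard `resource` module to read the stack-size limit of the current process, which is the same limit a shell reports through `ulimit -s`. The choice of the stack limit and the helper names are assumptions for this example; the report does not state which limits were changed.

```python
import resource


def format_limit(value):
    """Render an rlimit value, using 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def stack_limits():
    """Return the (soft, hard) stack-size limits, in bytes, for this process."""
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    soft, hard = stack_limits()
    print("stack soft:", format_limit(soft), "hard:", format_limit(hard))
```

Note that `getrlimit` reports bytes, whereas shells commonly print the stack limit in kilobytes.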

Certain Linux services may be disabled to minimize tasks that may consume CPU cycles.

Disabled through "service irqbalance stop". Depending on the workload involved, the irqbalance service reassigns various IRQ's to system CPUs. Though this service might help in some situations, disabling it can also help environments which need to minimize or eliminate latency to more quickly respond to events.

In-kernel CPU frequency governors are pre-configured power schemes for the CPU. The CPUfreq governors use P-states to change frequencies and lower power consumption. The dynamic governors can switch between CPU frequencies, based on CPU utilization to allow for power savings while not sacrificing performance.

Other options besides a generic performance governor can be set, such as the Performance governor and Powersave governor:

The governor defines the power characteristics of the system CPU, which in turn affects CPU performance. Each governor has its own unique behavior, purpose, and suitability in terms of workload.

On many Linux systems, the governor can be set for all CPUs through the cpupower utility with the following commands:
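
The exact commands used for this result are not reproduced here. As a complementary, read-only sketch, the Python code below reports which governor each CPU is currently using by reading the cpufreq sysfs files; a typical way to change it would be `cpupower frequency-set -g <governor>`. The function name `read_governors` is hypothetical, and the sysfs layout is assumed to be the usual `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current governor."""
    governors = {}
    # 'cpu[0-9]*' matches cpu0, cpu1, ... but not 'cpufreq' or 'cpuidle'.
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        cpu_name = path.split(os.sep)[-3]
        with open(path) as handle:
            governors[cpu_name] = handle.read().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(cpu, governor)
```

On systems without cpufreq support the function simply returns an empty dictionary.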

The tuned-adm tool is a command-line interface for switching between the different tuning profiles available to the tuned tuning daemon on supported Linux distributions. The default configuration file is located in /etc/tuned.conf and the supported profiles can be found in /etc/tune-profiles.

Some profiles that may be available by default include: default, desktop-powersave, server-powersave, laptop-ac-powersave, laptop-battery-powersave, spindown-disk, throughput-performance, latency-performance, enterprise-storage

To set a profile, one can issue the command "tuned-adm profile (profile_name)". Here are details about relevant profiles.

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. It is designed to hide much of the complexity in using huge pages from system administrators and developers. Huge pages increase the memory page size from 4 kilobytes to 2 megabytes. This provides significant performance advantages on systems with highly contended resources and large memory workloads. If memory utilization is too high or memory is badly fragmented, which prevents huge pages from being allocated, the kernel will assign smaller 4k pages instead. Most recent Linux OS releases have THP enabled by default.

THP usage is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/enabled.

THP creation is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/defrag.

An application that "always" requests THP often can benefit from waiting for an allocation until those huge pages can be assembled.
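
The two sysfs files named above list every allowed value on one line and mark the active one with square brackets, for example `always [madvise] never`. The following Python sketch reads that convention; the helper names are hypothetical, and the code only inspects the settings without changing them.

```python
def active_setting(sysfs_text):
    """Return the bracketed (currently selected) word from a sysfs choice line."""
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_thp_setting(name, base="/sys/kernel/mm/transparent_hugepage"):
    """Read 'enabled' or 'defrag' and return the active value, or None."""
    try:
        with open(f"{base}/{name}") as handle:
            return active_setting(handle.read())
    except OSError:
        # The file is absent on kernels built without THP support.
        return None


if __name__ == "__main__":
    print("enabled:", read_thp_setting("enabled"))
    print("defrag:", read_thp_setting("defrag"))
```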

Firmware / BIOS / Microcode Settings

Enables Logical processor (Software Method to Enable/Disable Logical Processor threads)

Disable: supports 1 cluster and 4-way IMC interleave. Enable SNC2: supports 2-cluster SNC and 2-way IMC interleave. Enable SNC4: supports 4-cluster SNC and 1-way IMC interleave. Auto: decides based on Si compatibility.

Enable: opportunistically fill dead lines in LLC. Disable: never fill dead lines in LLC. Auto: decides based on Si compatibility.

This option allows for correction of soft memory errors. Over the length of system runtime, the risk of producing multi-bit and uncorrected errors is reduced with this option. Values for this BIOS setting can be:

This option configures the processor Xtended Prediction Table (XPT) prefetch feature. The XPT prefetcher exists on top of other prefetchers that can prefetch data in the core DCU, MLC, and LLC. The XPT prefetcher will issue a speculative DRAM read request in parallel to an LLC lookup. This prefetch bypasses the LLC, saving latency. In some cases, setting this option to disabled can improve performance; typically, however, setting it to enabled provides better performance. This option must be enabled when Sub-NUMA Clustering is enabled. Values for this BIOS option can be:

This prefetcher is a L1 data cache prefetcher, which detects multiple loads from the same cache line done within a time limit, in order to then prefetch the next line from the L2 cache or the main memory into the L1 cache based on the assumption that the next cache line will also be needed.

Use input from ENERGY_PERF_BIAS_CONFIG mode selection. PERF/Balanced Perf/Balanced Power/Power

Enable/Disable Intel Virtualization Technology for Directed I/O (VT-d) by reporting the I/O device assignment to VMM through DMAR ACPI Tables.

Disable: Hardware chooses a P-state based on OS request (legacy P-states). Native Mode: Hardware chooses a P-state based on OS guidance. Out of Band Mode: Hardware autonomously chooses a P-state (no OS guidance).

For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2024 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.1.9.
Report generated on 2024-01-29 17:23:55 by SPEC CPU2017 flags formatter v5178.

CPU2017 Flag Description
Quanta Cloud Technology D54Q-2U (Intel Xeon Platinum 8480+, 2.0GHz)

Test sponsored by Quanta Computer Inc.


