Compilers: IBM XL C/C++ Version 16.1.0 for Linux
Compilers: IBM XL Fortran Version 16.1.0 for Linux
Libraries: IBM Advance Toolchain version 11.0-0, available for download at https://ibm.biz/AdvanceToolchain
IBM Post Link Optimizer: IBM Feedback Directed Program Restructuring (FDPR) for Linux on Power 5.6.4-0, available for download at https://developer.ibm.com/linuxonpower/sdk-packages/
Operating systems: SLES 12 SP3
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The xlc_r invocation is the thread-safe version of the xlc compiler. The xlc_at and xlc_r_at invocations link with the IBM Advance Toolchain libraries.
Only 64-bit compilation is supported.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Compilation conforms to the ISO C99 standard and accepts implementation-specific language extensions.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The xlC_r invocation is the thread-safe version of the xlC compiler. The xlC_at and xlC_r_at invocations link with the IBM Advance Toolchain libraries. Only 64-bit compilation is supported.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The xlf95_r invocation is the thread-safe version of the xlf95 compiler. The xlf95_at and xlf95_r_at invocations link with the IBM Advance Toolchain libraries. Only 64-bit compilation is supported.
![[benchmark]](https://www.spec.org/auto/cpu2017/images/benchmark.png)
This macro specifies that the target system uses the LP64 data model; specifically, that integers are 32 bits, while longs and pointers are 64 bits.
![[benchmark]](https://www.spec.org/auto/cpu2017/images/benchmark.png)
This macro indicates that the benchmark is being compiled on a little-endian PowerPC system running the Linux operating system.
![[suite]](https://www.spec.org/auto/cpu2017/images/suite.png)
This option indicates that the host system's integers are 32 bits wide, and that longs and pointers are 64 bits wide. Not all benchmarks recognize this macro, but the preferred practice for data model selection applies the flags to all benchmarks; this flag description is a placeholder for those benchmarks that do not recognize this macro.
![[benchmark]](https://www.spec.org/auto/cpu2017/images/benchmark.png)
This flag can be set for SPEC compilation on Linux using the default compiler.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time by not generating object code during the first IPA phase. This option adds -qipa=level=1 if that is not already set.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
-qalias=ansi | noansi:
If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object. The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.
-qalias=std | nostd:
Indicates whether the compilation units contain any non-standard aliasing. If so, specify nostd.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
-O5 is equivalent to the following flags :
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Supported values for this flag are :
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The supported values for suboption are :
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
-O4 is equivalent to the following flags:
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Link with the tcmalloc library for Linux on POWER, which optimizes calls to new, delete, malloc, and free.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Inserts prefetch instructions automatically where there are opportunities to improve code performance.
Example: -qprefetch=dscr=42
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Passes the -q flag to the linker, which causes the final executable to retain its relocation information.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Used in the first pass of a profile-directed feedback (PDF) compile; causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Used in the second pass of a profile-directed feedback (PDF) compile; causes the PDF information to be used during optimization.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The optimizations provided include:
-O3 is equivalent to the following flags :
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The inline option specifies the threshold and limit of inlined functions. Example: -qinline=40.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Disables transformations that may produce incorrect results in the presence of, or that may incorrectly produce, IEEE floating-point NaN (not-a-number) values.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Instructs the linker to allow multiple definitions of a symbol; the first definition is used. Normally, when a symbol is defined multiple times, the linker reports a fatal error.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
This option tells the compiler that each dynamic object allocated in the program fits within 4 GB.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
This flag is equivalent to -qunroll=no.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Adds the restrict type qualifier to the pointer parameters within all functions without modifying the source file.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Asserts the minimum physical page size during program execution.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Assumes that all functions with the name of an ANSI C defined library function are, in fact, the library functions.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA).
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Adds the restrict type qualifier to the pointer parameters within all functions without modifying the source file.
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
Inserts prefetch instructions automatically where there are opportunities to improve code performance.
Example : -qprefetch=dscr=42
![[user]](https://www.spec.org/auto/cpu2017/images/user.png)
-O5 is equivalent to the following flags :
Supported values for this flag are :
The supported values for suboption are :
Tells the compiler that the enum size is small.
Instructs the compiler to search for more opportunities for loop unrolling than -funroll-loops does. In general, -funroll-all-loops is more likely than -funroll-loops to increase compile time or program size, but it might also improve your application's performance.
The inline option specifies the threshold and limit of inlined functions. Example: -qinline=40.
The optimizations provided include:
-O3 is equivalent to the following flags :
Supported values for this flag are :
The supported values for suboption are :
The supported values for suboption are :
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and -qhot=level=1. The -qhot option is also implied by -O4 and -O5.
Reduces the size of the stack frame. Programs that allocate large amounts of data on the stack, such as threaded programs, may overflow the stack. This option can reduce the size of the stack frame to help avoid overflows.
Links with the tcmalloc library for Linux on POWER. This library optimizes calls to new, delete, malloc and free.
Suppresses the message with the specified message number. Examples: -qsuppress=1500-036 and -qsuppress=cmpmsg.
This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.
-O4 is equivalent to the following flags:
The optimizations provided include:
-O3 is equivalent to the following flags :
The supported values for suboption are :
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and -qhot=level=1. The -qhot option is also implied by -O4 and -O5.
Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA). The level determines the amount of interprocedural analysis and optimization that is performed.
Supported values for this flag are :
The supported values for suboption are :
submit = numactl -l -C $BIND $command
-l
Allocates memory from the node local to the CPU on which the process runs.
-C
Executes the process only on the listed CPUs. This accepts physical CPU numbers
as shown in the processor fields of /proc/cpuinfo.
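As a rough illustration of how the submit line expands, the sketch below substitutes a CPU number and a benchmark command into the template. The function name, the CPU number and the benchmark command are hypothetical; in a real run the SPEC tools supply $BIND and $command.

```python
import shlex

# Template taken from the submit line above; $BIND and $command are filled in
# by the SPEC tools at run time. Everything else here is illustrative.
SUBMIT_TEMPLATE = "numactl -l -C {bind} {command}"


def expand_submit(bind, command):
    """Return the argv list for one benchmark copy pinned to CPU `bind`."""
    line = SUBMIT_TEMPLATE.format(bind=bind, command=command)
    return shlex.split(line)


if __name__ == "__main__":
    # Hypothetical copy bound to CPU 8, running a made-up benchmark binary.
    print(expand_submit(8, "./benchmark_base.exe input.txt"))
```

Each copy is therefore started under numactl with local memory allocation and a single CPU binding, so its memory stays on the node of the CPU it runs on.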
fdprpro is a Feedback Directed Program Restructuring optimization tool that is available for the IBM POWER platform. It can be used optionally during FDO.
An example command to invoke fdprpro for the optimization pass is:
Additional details regarding usage and flags are provided below.
Usage:
fdprpro -a/--action [action] [options] program
where `program' specifies the input program in the form of an executable or a shared object
[action] can be one of the following:
anl analyze program
instr generate instrumented program for profile gathering
opt generate optimized program
check_sign check FDPR signature in the input program
sample generate script file for collecting sampled profile
[options] can be one of the following:
Analysis Options:
-aawc, --analyze-assembly-written-csects
Analyze objects written in Assembly.
-acf <analysis configuration file>, --analysis-configuration-file <analysis configuration file>
Provide a configuration file of analysis information (advanced option)
-asd/-noasd, --analyze-static-data/--noanalyze-static-data
-ifl <file>, --ignored-function-list <file>
Set the ignored function list. The file contains names of functions
that are considered unsafe and thus are not modified
Instrumentation Options:
-fd <Fdesc>, --file-descriptor <Fdesc>
Set the file descriptor number to be used when opening the profile
file. The default of <Fdesc> is set to the maximum-allowed number of
open files
-icvp, --instr-call-value-profiling
Instrument the values of parameters passed in function calls
-imullX, --mullX-instrumentation
Perform value profiling of RA and RB operands in mullX instructions
-iderat, --derat-instrumentation
Perform value profiling of RA and RB operands in load/store indexed
instructions
-issu, --instrumentation-safe-stack-usage
Ensure that additional stack space is properly allocated for the
instrumented run. Use this option if your application uses the stack
extensively (e.g., when the program uses alloca()). Note that this
option adds extra overhead on instrumentation code
-iso <offset>, --instrumentation-stack-offset <offset>
Set the offset from the stack, a negative number, where the
instrumentation's area for saving registers is kept at runtime. Use
with care
-M <addr>, --profile-map <addr>
Set the shared memory segment address for profiling. Alternative shared
memory addresses are needed when the instrumented program application
creates a conflict with the shared-memory addresses preserved for the
profiling. Typical alternative values are 0x40000000, 0x50000000, ...
up to 0xC0000000. The default is set to 0x3000000
-ptm, --profile-to-memory
Use shared memory key instead of file mapping to obtain a shared memory
area for the profile data
-ri/-nori, --register-instrumentation/--noregister-instrumentation
Instrument/Do not instrument the input program file to collect profile
information about indirect branches via registers. The default is set
to collect the profile information
-sfp/-nosfp, --save-floating-point-registers/--nosave-floating-point-registers
Save/Do not save floating point registers in instrumented code. The
default is set to save floating point registers
-shmkey <key number>, --shared-memory-key <key number>
Specify a shared memory key to use when creating a shared memory area
for the profile. The default key is created by hashing the profile
file name (with ftok).
Profile Files Options:
-af <prof_file>, --ascii-profile-file <prof_file>
Set the name of a text format profile file containing profile
information.
-aop, --accept-old-profile
Accept the old profile file collected on previous versions of the input
program file (requires the -f flag)
-f <prof_file>, --profile-file <prof_file>
Set the profile file name. The profile file is created during the
instrumentation phase and read during the optimization phase. The
profile file is updated each time you run the instrumented program
-fdir <prof_file_dir>, --profile-file-directory <prof_file_dir>
Set the run-time location of the profile file. The profile will be
searched for during the profiling phase at this location. The default
location is the path given in the profile file name (-f option).
Applicable only at instrumentation phase
Optimization Options:
-A <alignment>, --align-code <alignment>
Specify code alignment strategy. 1: Use grouping rules of target
machine (default), 2: Same as 1 but consider also hotness of branch
targets. See -m for the selected machine model.
-abb <factor>, --align-basic-blocks <factor>
Align basic blocks that are hotter than the average by a given (float)
<factor>. This is a lower-level machine-specific alignment compared to
--align-code. Value of -1 (the default) disables this option
-bf, --branch-folding
Eliminate branch to branch instructions
-ccc <threshold>, --cold-code-connector <threshold>
Preserves original order for code which is less frequently executed
than given threshold
-bp, --branch-prediction
Set branch prediction bit for conditional branches according to the
collected profile
-cbpth, --cold-branch-prediction-threshold
Set the Cold Branch Prediction Threshold for branch prediction
optimization. Branches whose execution count relative to the average
is below this value will be statically predicted. Allowed values are
between (0,1). Default is -1 - optimization is not applied.
(Applicable only with the -bp flag)
-pbp, --preserve-branch-predication
Preserve branch predication pattern (bc+8) and avoid code reordering
and branch prediction
-cbsi, --chain-based-selective-inline
Perform selective inlining of functions that produce long hot chains of
code
-dce, --dead-code-elimination
Eliminate instructions related to unused local variables within
frequently executed functions. This is useful mainly after applying
function inlining optimization
-dp, --data-prefetch
Insert data-cache prefetch instructions to improve data-cache
performance
-ece, --epilog-code-eliminate
Reduce code size by grouping common instructions in function epilogs,
into a single unified code
-fatc <num_of_bytes>, --fat-const <num_of_bytes>
Inflate constant areas in code section by adding <num_of_bytes> (entire
set to 255) to each constant area
-fatd <num_of_bytes>, --fat-data <num_of_bytes>
Inflate data section by adding <num_of_bytes> (entire set to 255) to
each data basic unit
-fatn <num_of_nops>, --fat-nop <num_of_nops>
Inflate code section by adding <num_of_nops> to each code basic block
-bined <binary_editor>, --binary-editor <binary_editor>
Edit existing binary code (advanced option)
-hr, --hco-reschedule
Relocate instructions from frequently executed code to rarely executed
code areas, when possible
-hrf <factor>, --hco-resched-factor <factor>
Set the aggressiveness of the -hr optimization option according to a
factor value between (0,1), where 0 is the least aggressive factor
(applicable only with the -hr option)
-tasr, --toc-anchor-store-reschedule
Relocate TOC store instructions from frequently executed code to rarely
executed code areas, when possible
-i, --inline
Same as --selective-inline with --inline-small-funcs 12
-ihf <pct>, --inline-hot-functions <pct>
Inline all function call sites to functions that have a frequency count
greater than the given <pct> frequency percentage
-isf <size>, --inline-small-funcs <size>
Inline all functions that are smaller than or equal to the given <size>
in bytes
-kr, --killed-registers
Eliminate stores and restores of registers that are killed
(overwritten) after frequently executed function calls
-lap, --load-address-propagation
Eliminate load instructions of variable addresses by re-using
pre-loaded addresses of adjacent variables
-las, --load-after-store
Add NOP instructions to place each load instruction further apart
following a store instruction that references the same memory address
-plas, --pattern-based-load-after-store
Optimizes inefficient memory access patterns in order to avoid
load-after-store events.
-ebplas, --event-based-pattern-based-load-after-store
Optimizes inefficient memory access patterns in order to avoid
load-after-store events. The optimization is possible if
PM_MRK_LSU_REJECT_LHS profile is available
-rcl, --remove-constant-load
Reduces the number of load instructions used to bring constant values
into registers. The parameter controls which version of the
optimization is applied; versions 0 to 3 are available.
-pvgc <mode>, --print-visual-graph-csect <mode>
Print a .dot file with CFG information for each csect. Mode 0 is for a
graph containing full instructions list for each node, 1 is for a
graph with short nodes description.
-pvgf <mode>, --print-visual-graph-func <mode>
Print a .dot file with CFG information for each function. Mode 0 is for
a graph containing full instructions list for each node, 1 is for a
graph with short nodes description.
-lro, --link-register-optimization
Eliminate saves and restores of the link register in
frequently-executed functions
-lu <aggressiveness_factor>, --loop-unroll <aggressiveness_factor>
Unroll short loops containing one to several basic blocks according to
an aggressiveness factor between (1,9), where 1 is the least
aggressive unrolling option for very hot and short loops
-lun <unrolling_number>, --loop-unrolling-number <unrolling_number>
Set the number of unrolled iterations in each unrolled loop. The
allowed range is between (2,50). Default is set to 2. (Applicable only
with the -lu flag)
-lux <unrolling_factor>, --loop-unroll-extended <unrolling_factor>
Unroll hot loops using given unrolling factor. The allowed values are
integer numbers that are power of 2. Value -1 disables the
optimization, value 1 calculates the unrolling factor automatically,
given a machine model
-nop, --nop-removal
Remove NOP instructions from reordered code
-sls, --store-load-on-stack-opt
Optimize store load on stack pattern
-fmrx, --fmr-to-xxlor
Replace FMR instructions from reordered code with XXLOR instruction
-xscpx, --xscpsgndp-to-xxlor
Replace Xscpsgndp instructions from reordered code with XXLOR
instruction
-divopt, --divide-optimization
Replace fdiv/fdivs instructions with fre + fmul/fmuls instructions
-tslopt, --toc-store-in-loop-optimization
Remove toc store instructions from the loop and place toc store
instruction before loop
-ifopt, --instruction_fusion_optimization
Put together two instructions suitable for fusion
-liopt, --loop_invariant_optimization
Move loop invariant instructions out of the loop
-sfopt, --simple_functions_opt
Inline simple functions (isascii, isdigit)
-dir, --dependant-instr-resched
Put NOP between dependent instructions
-O Switch on basic optimizations only. Same as -RC -nop -bp -bf
-O2 Switch on less aggressive optimization flags. Same as -O -hr -pto -isf
8 -tlo -kr -see 0
-O3 Switch on aggressive optimization flags. Same as -O2 -RD -isf 12 -si
-lro -las -vro -btcar (for XCOFF files) -lu 9 -rt 0 -so -see 1 -oderat
-tslopt
-O4 Switch on aggressive optimization flags together with aggressive
function inlining. Same as -O3 -sidf 50 -ihf 20 -sdp 9 -shci 90 and
-bldcg (for XCOFF files)
-ocvp, --opt-call-value-profiling
Specialize function calls according to the values of their passed
parameters
-omullX, --mullX-optimization
Optimize mullX instructions by adding a run-time check on RA and RB and
performing equivalent operations with lower penalty. The optimization
requires the use of -imullX in the instrumentation phase
-oderat, --derat-optimization
Optimize load/store indexed instructions by adding a run-time check on
RA and RB and performing equivalent operations with lower penalty. The
optimization requires the use of -iderat in the instrumentation phase
-pbsi, --path-based-selective-inline
Perform selective inlining of dominant hot function calls based on the
control flow paths leading to hot functions
-pca, --propagate-constant-area
Relocate the constant variables area to the top of the code section
when possible
-pr/-nopr, --ptrgl-r11/--noptrgl-r11
Perform/Do not perform removal of R11 load instruction in _ptrgl csect
(the default is to perform the optimization)
-pto, --ptrgl-optimization
Perform optimization of indirect call instructions via registers by
replacing them with conditional direct jumps
-ptoht <heatness_threshold>, --ptrgl-optimization-heatness-threshold <heatness_threshold>
Set the frequency threshold for indirect calls that are to be optimized
by -pto optimization. Allowed range between 0 and 1. Default is set to
0.8. (Applicable only with -pto flag)
-ptosl <limit_size>, --ptrgl-optimization-size-limit <limit_size>
Set the limit of the number of conditional statements generated by -pto
optimization. Allowed values are between 1 and 100. Default value is
set to 3. (Applicable only with the -pto flag)
-RC, --reorder-code
Perform code reordering
-rcaf <aggressiveness_factor>, --reorder-code-aggressivenes-factor <aggressiveness_factor>
Set the aggressiveness of code reordering optimization. Allowed values
are [0 | 1 | 2], where 0 preserves the original code order and 2 is
the most aggressive. Default is set to 1. (Applicable only with the
-RC flag)
-rccrf <reversal_factor>, --reorder-code-condition-reversal-factor <reversal_factor>
Set the threshold fraction that determines when to enable condition
reversal for each conditional branch during code reordering. Allowed
input range is between 0.0 and 1.0 where 0.0 tries to preserve
original condition direction and 1.0 ignores it. Default is set to 0.8
(Applicable only with the -RC flag)
-rcctf <termination_factor>, --reorder-code-chain-termination-factor <termination_factor>
Set the threshold fraction that determines when to terminate each chain
of basic blocks during code reordering. Allowed input range is between
0.0 and 1.0 where 0.0 generates long chains and 1.0 creates single
basic block chains. Default is set to 0.05. (Applicable only with the
-RC flag)
-RD, --reorder-data
Perform static data reordering
-ippcf, --instrument-for-path-profiling
Perform cross function path profiling instrumentation
-ppcf, --optimize-with-path-profiling
Perform cross function path profiling optimization
-rmte, --remove-multiple-toc-entries
Remove multiple TOC entries pointing to the same location in the input
program file
-rt <removal_factor>, --reduce-toc <removal_factor>
Perform removal of TOC entries according to a removal factor between
(0,1), where 0 removes non-accessed TOC entries only and 1 removes all
possible TOC entries
-rtb, --remove-traceback-tables
Remove traceback tables in reordered code
-sdp <aggressiveness_factor>, --stride-data-prefetch <aggressiveness_factor>
Perform data prefetching within frequently executed loops based on
stride analysis, according to an aggressiveness factor between (1,9),
where 1 is the least aggressive
-sdpila <instructions_number>, --stride-data-prefetch-instruction-look-ahead <instructions_number>
Set the number of instructions for which data is prefetched into the
cache ahead of time. Default value is platform dependent. (Applicable
only with the -sdp flag)
-sdpms <stride_min_size>, --stride-data-prefetch-min-size <stride_min_size>
Set the minimal stride size in bytes, for which data will be considered
a candidate for prefetching. Default value is set to 128 bytes.
(Applicable only with the -sdp flag)
-ebp <evt_based_prefetch>, --event-based-prefetch <evt_based_prefetch>
Perform data prefetching based on the events file
-ebpla <instructions_number>, --event-based-prefetch-look-ahead <instructions_number>
Set the number of instructions for which event based prefetch is
performed. Default value is platform dependent. (Applicable only with
the -ebp flag)
-vecopt
Use vector optimizations (remove double xxswapd, remove redundant load,
remove xxlnand and replace data with complemented data, replace lxvd2x
from rodata and xxswapd by lvx)
-see <level>
Use simplified prolog/epilog for functions that perform conditional
early-exit. Use basic optimization with <level>=0 and maximal with
<level>=1
-shci <pct>, --selective-hot-code-inline <pct>
Perform selective inlining of functions in order to decrease the total
number of execution counts, so that only functions with hotness above
the given percentage are inlined
-si, --selective-inline
Perform selective inlining of dominant hot function calls
-chca, --convert_hole_to_constareas
Convert Holes In SafeCSects To ConstAreas
-sidf <percentage_factor>, --selective-inline-dominant-factor <percentage_factor>
Set a dominant factor percentage for selective inline optimization. The
allowed range is between 0 and 100. Default is set to 80. (Applicable
only with the -si and -pbsi flags)
-siht <frequency_factor>, --selective-inline-hotness-threshold <frequency_factor>
Set a hotness threshold factor percentage for selective inline
optimization to inline all dominant function calls that have a
frequency count greater than the given frequency percentage. Default
is set to 100. (Applicable only with the -si and -pbsi flags)
-slbp, --spinlock-branch-prediction
Perform branch prediction bit setting for conditional branches in
spinlock code containing l*arx and st*cx instructions. (Applicable
after -bp flag)
-sldp, --spinlock-data-prefetch
Perform data prefetching for memory access instructions preceding
spinlock code containing l*arx and st*cx instructions
-sll <Lib1:Prof1,...,LibN:ProfN>, --static-link-libraries <Lib1:Prof1,...,LibN:ProfN>
Statically link hot code from specified dynamically linked libraries to
the input program. The parameter consists of a comma-separated list of
libraries and their profiles. IMPORTANT: Licensing rights of specified
libraries should be observed when applying this copying optimization
-sllht <hotness_threshold>, --static-link-libraries-hotness-threshold <hotness_threshold>
Set hotness threshold for the --static-link-libraries optimization. The
allowed input range is between 0 (least aggressive) and 1, or -1,
which does not require a profile and selects all code that might be
called by the input program from the given libraries. Default is set
at 0.5
-so, --stack-optimization
Reduce the stack frame size of functions that are called with a small
number of arguments
-spc, --shortcut-plt-calls
Shortcut PLT calls in shared libraries to local functions if they
exist. Note: Resolving to external symbols is disabled for such calls
-tb, --preserve-traceback-tables
Force the restructuring of traceback tables in reordered code. If -tb
option is omitted, traceback tables are automatically included only
for C++ applications that use the Try & Catch mechanism
-tlo, --tocload-optimization
Replace each load instruction that references the TOC with a
corresponding add-immediate instruction via the TOC anchor register,
where possible
-vro, --volatile-registers-optimization
Eliminate stores and restores of non-volatile registers in frequently
executed functions by using available volatile registers
-vrox, --volatile-registers-extended-optimization
Eliminate stores and restores of non-volatile registers in frequently
executed functions by using available volatile registers, the extended
version supports FP registers and transparency
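The -O, -O2, -O3 and -O4 shorthands in the list above are defined in terms of one another, so it can help to see them flattened. The sketch below simply transcribes those definitions into a table and expands them recursively. It leaves out the two flags marked "for XCOFF files" (-btcar, -bldcg), on the assumption that the Linux binaries discussed here are ELF files. Where an option appears twice in the expansion (for example -isf 8 and then -isf 12), the help text does not say how fdprpro resolves it, so the sketch keeps both in order rather than guessing.

```python
# Shorthand levels transcribed from the fdprpro help text above.
# XCOFF-only flags (-btcar, -bldcg) are omitted; that is an assumption
# made for ELF binaries on Linux, not something the help text states.
LEVELS = {
    "-O": ["-RC", "-nop", "-bp", "-bf"],
    "-O2": ["-O", "-hr", "-pto", "-isf", "8", "-tlo", "-kr", "-see", "0"],
    "-O3": ["-O2", "-RD", "-isf", "12", "-si", "-lro", "-las", "-vro",
            "-lu", "9", "-rt", "0", "-so", "-see", "1", "-oderat", "-tslopt"],
    "-O4": ["-O3", "-sidf", "50", "-ihf", "20", "-sdp", "9", "-shci", "90"],
}


def expand(level):
    """Recursively replace a shorthand level with the flags it stands for."""
    flags = []
    for token in LEVELS[level]:
        if token in LEVELS:
            flags.extend(expand(token))   # nested shorthand, e.g. -O inside -O2
        else:
            flags.append(token)           # plain flag or its argument
    return flags


if __name__ == "__main__":
    print(" ".join(expand("-O3")))
```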
Output Options:
-cep, --complement-edge-profile
Complements partial profile information given for the basic blocks'
frequencies by adding missing basic block-to-basic block edge counts
-d, --disassemble-text
Print the disassembled text section of the output program into
<output_file>.dis_text file
-dap, --dump-ascii-profile
Dump profile information in ASCII format into <program>.aprof (requires
the -f flag).
-db, --disassemble-bss
Print the disassembled bss section of the output program into
<output_file>.dis_bss file
-dd, --disassemble-data
Print the disassembled data section of the output program into
<output_file>.dis_data file
-diap, --dump-initial-ascii-profile
Dump the given profile information in ASCII format into
<program>.aprof.init (requires the -f flag)
-dim, --dump-instruction-mix
Dump instruction mix statistics based on gathered profile information
-dm, --dump-mapper
Print a map of basic blocks and static variables with their respective
new -> old addresses into a <program>.mapper file
-o <output_file>, --output-file <output_file>
Set the name of the output file. The default instrumented file is
<program>.instr. The default optimized file is <program>.fdpr
-scl, --show-constant-load
Adds annotations in fdpr disassembly on load instructions used to bring
constant values into registers (requires -d flag)
-ppcf, --print-prof-counts-file
Print a text format of the profiling counters into a <program>.counts
file (requires the -f flag).
-sf, --strip-file
Strip the output file
-simo, --single-input-multiple-outputs
Optimize in parallel into multiple outputs as specified by option sets
read from stdin
General Options:
-h, --help
Print the online help
-j <jour_file>, --journal <jour_file>
Output optimization journal information to <jour_file>
-smt, --smt_mode
Set SMT mode (1:ST, 2: (SMT2-shared, SMT2-split), 4:SMT4, 8:SMT8)
-m <machine-model>, --machine <machine-model>
Generate code for the specified machine model. Target machine can be
one of the following models: power2, power3, ppc405, ppc440, power4,
ppc970, power5, power6, power7, ppe, spe, spe_edp, z10, z9. Default is
power7
-q, --quiet
Set the output mode to quiet, suppressing informational messages
-st <stat_file>, --statistics <stat_file>
Output statistics information to <stat_file>. If <stat_file> is '-',
the output goes to the standard output. See --verbose for the default
-v <level>, --verbose <level>
Set verbose output mode level. When set, various statistics about the
output program are printed into the file <program>.stat. Allowed level
range is between 0 and 3. Default is set to 0
-V, --version
Print the version number
-w <level>, --warning-level <level>
Set the warning level so only errors of this level and below will be
printed. The levels are: 1: errors, 2: warnings, 3: debug warning, 4:
debug information. Default is 2
- Analysis options should be specified identically in the instrumentation and optimization phases
- Some options are relevant only to specific platforms
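To make the relationship between the instr and opt actions concrete, here is a minimal sketch that assembles the command lines for one instrument, train and optimize cycle. It uses only flags listed in the help text above (-a, -f, -o, -O3, and -aawc as a sample analysis option). The program name, the profile file name and the assumption that the instrumented binary is run from the current directory are all hypothetical.

```python
def fdpr_cycle(program, profile, analysis_opts=(), opt_level="-O3"):
    """Return the commands for an instrument / train / optimize cycle.

    `program` and `profile` are hypothetical file names. The same analysis
    options are passed to both fdprpro phases, as the notes above require.
    """
    analysis = list(analysis_opts)
    instrument = ["fdprpro", "-a", "instr", *analysis, "-f", profile, program]
    # The help text gives <program>.instr as the default instrumented file.
    train = [f"./{program}.instr"]
    optimize = ["fdprpro", "-a", "opt", *analysis, "-f", profile,
                opt_level, "-o", f"{program}.fdpr", program]
    return [instrument, train, optimize]


if __name__ == "__main__":
    for cmd in fdpr_cycle("bench", "bench.prof", analysis_opts=["-aawc"]):
        print(" ".join(cmd))
```

The training step would normally be run with a representative workload, since the profile file is updated each time the instrumented program runs.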
XLFRTEOPTS=intrinthrds=1 : Causes the Fortran runtime to use only a single thread.
LD_PRELOAD=/opt/at11.0/lib64/libhugetlbfs.so : Preloads the hugepage library so that it can back malloc() and shared memory; text and data segments can be partially backed if they are large enough.
TCMALLOC_MEMFS_MALLOC_PATH=/dev/hugepages/ : If set, specifies a path where hugetlbfs or tmpfs is mounted. This may allow for speedier allocations.
MALLOC_MMAP_MAX_=0 : When combined, MALLOC_TRIM_THRESHOLD and MALLOC_MMAP_MAX force malloc() to use sbrk() rather than mmap() to allocate memory. This improves performance, but it may reduce the total amount of memory available to your user processes (to no more than 1 Gbyte/process).
echo 11520 > /proc/sys/vm/nr_hugepages
You can also use the environment variables below to manage huge page behavior:
- HUGETLB_VERBOSE=0 : Turns off debugging messages from libhugetlbfs.
- HUGETLB_MORECORE=yes : Instructs libhugetlbfs to override libc's normal morecore() function with a hugepage version and use it for malloc().
- HUGETLB_MORECORE_HEAPBASE=0x50000000 : Specifies that the hugepage heap starts at address 0x50000000.
- HUGETLB_ELFMAP=R : Instructs libhugetlbfs to place the text segment in hugepages.
- HUGETLB_ELFMAP=W : Instructs libhugetlbfs to place the data and BSS segments in hugepages.
- HUGETLB_ELFMAP=RW : Instructs libhugetlbfs to place all segments in hugepages.
- HUGETLB_ELFMAP=no : Instructs libhugetlbfs not to place any segment in hugepages.
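After writing a page count to /proc/sys/vm/nr_hugepages, it can be useful to confirm how many huge pages the kernel actually reserved. The following is a minimal sketch; the function name `hugepage_summary` is hypothetical. It reads /proc/meminfo-style text on standard input and reports the `HugePages_Total`, `HugePages_Free`, and `Hugepagesize` fields.

```shell
#!/bin/sh
# Hypothetical helper: summarize huge page counters from /proc/meminfo-style input.
hugepage_summary() {
    awk '/^HugePages_Total:/ {t=$2}
         /^HugePages_Free:/  {f=$2}
         /^Hugepagesize:/    {s=$2}
         END {printf "total=%s free=%s size_kB=%s\n", t, f, s}'
}

# On a Linux system, read the live counters if they are available.
if [ -r /proc/meminfo ]; then
    hugepage_summary < /proc/meminfo
fi
```

If the reported total is lower than the requested count, the kernel could not find enough contiguous memory. A run could then be launched with the variables above set inline, for example `HUGETLB_MORECORE=yes HUGETLB_ELFMAP=RW ./my_benchmark`, where `./my_benchmark` is a placeholder program name.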
Power and Performance Mode can be set from the Advanced System Management menu. It controls the trade-offs between power efficiency, frequency, and consistency. Four modes are available:
- Disable All Modes: The processor clock frequency is set to its fixed, nominal value.
- Static Power Saver Mode: Enabling this feature lowers the processor clock frequency and voltage to fixed values. This reduces the power consumption of the system while delivering predictable performance.
- Dynamic Performance Mode: Enabling this feature causes the processor frequency to vary based on workload and active core count. As the workload or active core count decreases, the processor uses less power, which enables the frequency to be increased above nominal. During periods of very low utilization, the processor frequency is reduced in order to save energy. This mode provides consistent performance across all environmental operating conditions.
- Maximum Performance Mode: Enabling this feature causes the processor frequency to vary based on workload and active core count. As the workload or active core count decreases, the processor uses less power, which enables the frequency to be increased above nominal. In this mode, the allowed socket power is increased to the maximum value, which results in top performance along with increased fan noise and higher power consumption. In more stressful environmental conditions, performance may vary. This is the default mode.
Idle Power Saver is an option that can be combined with Maximum Performance Mode, Dynamic Performance Mode, and Disable All Modes to allow the system to drop to a frequency level below nominal frequency under programmable idle circumstances.
Two Speculative Execution Control knobs are available:
Default selection is "Speculative execution controls to mitigate user-to-kernel and user-to-user side-channel attacks".
For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact info@spec.org
Copyright 2017-2018 Standard Performance Evaluation Corporation
Tested with SPEC CPU2017 v1.0.5.
Report generated on 2018-10-31 18:41:13 by SPEC CPU2017 flags formatter v5178.