IBM-Linux-XL
Linux on Power with IBM XL Compilers SPEC CPU 2006 Flags
Compilers: IBM XL C/C++ Advanced Edition for Linux V9.0 and XL Fortran Advanced Edition for Linux V11.1
Compilers: IBM XL C/C++ for Linux V10.1 and XL Fortran for Linux V12.1
Compilers: IBM XL C/C++ for Linux V11.1 and XL Fortran for Linux V13.1
Compilers: IBM XL C/C++ for Linux V12.1 and XL Fortran for Linux V14.1
Operating systems: SUSE Linux Enterprise 10, SUSE Linux Enterprise 11, Red Hat Enterprise Linux Advanced Platform 5, and Red Hat Enterprise Linux Server release 6
Last updated: 2-Oct-2012
]]>
exampleOFxlc
Invoke the IBM XL C compiler. 32-bit binaries are produced by default.
]]>
exampleOFxlC
Invoke the IBM XL C++ compiler. 32-bit binaries are produced by default.
]]>
exampleOFxlf95
Invoke the IBM XL Fortran compiler. 32-bit binaries are produced by default.
]]>
exampleOFxlf95
Invoke the IBM XL Fortran compiler with the 'r' capabilities.
]]>
fdpr -O3
Invoke IBM fdpr, a feedback-directed optimization (FDO) tool, to optimize a binary module.
]]>
-O5
Perform optimizations for maximum performance. This includes maximum
interprocedural analysis on all of the objects presented on the "link"
step. This level of optimization will increase the compiler's memory
usage and compile time requirements. -O5 provides all of the functionality
of the -O4 option, plus the functionality of the
-qipa=level=2 option.
-O5 is equivalent to the following flags
- -O4
- -qipa=level=2
- -qarch=auto
- -qtune=auto
]]>
-O4
Perform optimizations for maximum performance. This includes
interprocedural analysis on all of the objects presented on the "link"
step.
-O4 is equivalent to the following flags
- -O3
- -qipa=level=1
- -qarch=auto
- -qtune=auto
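The -O4 and -O5 equivalences listed in this file can be expanded mechanically. The following is a minimal Python sketch, not part of the compiler: the names EQUIVALENT, expand, and effective are hypothetical, -O3 is treated as a leaf because its expansion is not listed here, and the rule that a later setting of the same option replaces an earlier one is an assumption.

```python
# Equivalences copied from the -O4 and -O5 entries of this file.
# -O3 is treated as a leaf because its expansion is not listed here.
EQUIVALENT = {
    "-O5": ["-O4", "-qipa=level=2", "-qarch=auto", "-qtune=auto"],
    "-O4": ["-O3", "-qipa=level=1", "-qarch=auto", "-qtune=auto"],
}


def expand(flag):
    """Recursively replace a flag with the flags it is documented to imply."""
    if flag not in EQUIVALENT:
        return [flag]
    expanded = []
    for part in EQUIVALENT[flag]:
        expanded.extend(expand(part))
    return expanded


def effective(flags):
    """Collapse repeats, assuming a later setting of the same option wins."""
    settings = {}
    for flag in flags:
        # "-qipa=level=2" -> key "-qipa=level"; "-O3" has no "=" and is its own key
        key = flag.rsplit("=", 1)[0]
        settings[key] = flag
    return list(settings.values())


print(effective(expand("-O5")))
```

Under these assumptions, -O5 differs from -O4 only in the IPA level that remains in effect.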
]]>
-O3
Performs additional optimizations that are memory intensive, compile-time
intensive, and may change the semantics of the program slightly, unless
-qstrict is specified. We recommend these optimizations when the desire for
run-time speed improvements outweighs the concern for limiting compile-time
resources. The optimizations provided include:
- In-depth memory access analysis
- Better loop scheduling
- High-order loop analysis and transformations (-qhot=level=0)
- Inlining of small procedures within a compilation unit by default
- Eliminating implicit compile-time memory usage limits
- Widening, which merges adjacent load/stores and other operations
- Pointer aliasing improvements to enhance other optimizations
-O3 is equivalent to the following flags
]]>
-O2
Performs a set of optimizations that are intended to offer improved
performance without an unreasonable increase in time or storage that is
required for compilation including:
- Eliminates redundant code
- Basic loop optimization
- Can structure code to take advantage of -qarch and -qtune settings
]]>
-O
Enables the level of optimization that represents the best tradeoff between compilation speed and run-time performance. If you need a specific level of optimization, specify the appropriate numeric value. Currently, -O is equivalent to -O2.
]]>
-qhot
Performs high-order transformations on loops during optimization.
o arraypad
The compiler will pad any arrays where it infers that there may be a benefit.
o level=0
The compiler performs a limited set of high-order loop transformations.
o level=1
The compiler performs its full set of high-order loop transformations.
o simd
Replaces certain instruction sequences with vector instructions.
o vector
Replaces certain instruction sequences with calls to the MASS library.
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and -qhot=level=1. The -qhot option is also implied by -O4 and -O5.
]]>
-qarch=pwr5x, -qarch=auto
Produces object code containing instructions that will run on the
specified processors. "auto" selects the processor on which the compile
is being done. "pwr5x" is the POWER5+ processor.
Supported values for this flag are
- auto - Use the processor on which the program is compiled.
- pwr7 - The POWER7 processor based systems.
- pwr6e - The POWER6 processor in "Enhanced" mode based systems.
- pwr6 - The POWER6 processor based systems.
- pwr5x - The POWER5+ processor based systems.
- pwr5 - The POWER5 processor based systems.
- pwr4 - The POWER4 processor based systems.
- ppc970 - The PPC970 processor based systems.
]]>
-qtune=pwr4, -qtune=auto
Specifies the processor architecture for which the executable program
is optimized. This includes instruction scheduling and cache settings.
The supported values for suboption are
- auto - Use the processor on which the program is compiled.
- pwr7 - The POWER7 processor based systems.
- pwr6 - The POWER6 processor based systems.
- pwr5x - The POWER5+ processor based systems.
- pwr5 - The POWER5 processor based systems.
- pwr4 - The POWER4 processor based systems.
- ppc970 - The PPC970 processor based systems.
]]>
-qipa=level
Enhances optimization by doing detailed analysis across procedures
(interprocedural analysis or IPA).
The level determines the amount of interprocedural analysis
and optimization that is performed.
level=0 does only minimal interprocedural analysis and optimization.
level=1 turns on inlining, limited alias analysis, and limited
call-site tailoring.
level=2 turns on full interprocedural data flow and alias analysis.
]]>
-qalias=noansi
qalias=ansi | noansi
If ansi is specified, type-based aliasing is
used during optimization, which restricts the
lvalues that can be safely used to access a
data object. The default is ansi for the xlc,
xlC, and c89 commands. This option has no
effect unless you also specify the -O option.
qalias=std | nostd
Indicates whether the compilation units contain
any non-standard aliasing (see Compiler Reference
for more information). If so, specify nostd.
]]>
Link with libhugetlbfs.so. This enables the heap to be backed by 16 MB pages.
Indicates that the input Fortran source program is in fixed form.
Adds an underscore to global entities to match the C compiler ABI.
Causes the compiler to treat "char" variables as signed instead of the
default of unsigned.
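The practical effect of char signedness can be illustrated outside the compiler. This is a small Python sketch using the standard struct module: the same byte is read once as a signed char and once as an unsigned char. The variable names are illustrative only.

```python
import struct

raw = b"\xff"  # one byte with every bit set

# Format "b" reads the byte as a signed char, "B" as an unsigned char.
as_signed = struct.unpack("b", raw)[0]
as_unsigned = struct.unpack("B", raw)[0]

print(as_signed, as_unsigned)
```

Code that compares a plain "char" against negative values, or uses it as an array index, can therefore behave differently depending on which interpretation the compiler applies.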
Indicates that the compiler understands how to do alloca().
Generates 32-bit ABI binaries.
Generates 64-bit ABI binaries. The default is to generate 32-bit binaries.
Specifies what aggregate alignment rules the compiler uses for file compilation,
where the alignment options are:
bit_packed
The compiler uses the bit_packed alignment rules.
full
The compiler uses the RISC System/6000 alignment rules. This is the same
as power.
mac68k
The compiler uses the Macintosh alignment rules. This suboption is valid only
for 32-bit compilations.
natural
The compiler maps structure members to their natural boundaries.
packed
The compiler uses the packed alignment rules.
power
The compiler uses the RISC System/6000 alignment rules.
twobyte
The compiler uses the Macintosh alignment rules. This suboption is valid
only for 32-bit compilations. The mac68k option is the same as twobyte.
The default is -qalign=full.
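The difference between mapping members to their natural boundaries and packing them can be sketched with Python's struct module. This is only an analogy under stated assumptions: the aggregate (one char followed by one int) is hypothetical, "@" uses the host's native alignment rather than the XL "natural" rules, and "=" stands in for a layout with no padding.

```python
import struct

# Hypothetical aggregate: struct { char tag; int value; }
FIELDS = "ci"

# "@" lays fields out with the host's native sizes and alignment, which
# normally pads after the char so the int starts on its own boundary.
natural_size = struct.calcsize("@" + FIELDS)

# "=" uses standard sizes with no padding: 1 byte for the char, 4 for the int.
packed_size = struct.calcsize("=" + FIELDS)

padding = natural_size - packed_size
print(natural_size, packed_size, padding)
```

The padding bytes are the cost of aligned access; packed layouts save space but can make member access slower on some processors.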
]]>
Link the Fortran runtime library libxlf90_r.so which is required by libessl.so.
Link the mathematical acceleration subsystem libraries (MASS), which contain libraries of tuned mathematical intrinsic functions.
Link the Engineering and Scientific Subroutine Library (ESSL), libessl.so.
ESSL is a collection of subroutines providing a wide range of performance-tuned mathematical functions for many common scientific and engineering applications. The mathematical subroutines are divided into nine computational areas:
- Linear Algebra Subprograms
- Matrix Operations
- Linear Algebraic Equations
- Eigensystem Analysis
- Fourier Transforms, Convolutions, Correlations and Related Computations
- Sorting and Searching
- Interpolation
- Numerical Quadrature
- Random Number Generation
]]>
Specifies that, if either -lessl or -lesslsmp are also specified, then Engineering and Scientific Subroutine Library (ESSL) routines should be used in place of some Fortran 90 intrinsic procedures when there is a safe opportunity to do so.
The option used in the first pass of a profile directed feedback (PDF) compile that causes PDF information
to be generated. The profile directed feedback optimization gathers data on both execution path and
data values. It does not use hardware counters, nor gather any data other than path and data values
for PDF specific optimizations.
The option used in the second pass of a profile directed feedback compile that causes PDF information
to be utilized during optimization.
Supports the ISO C99 standard and accepts implementation-specific language extensions.
Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that
optimizes calls to new, delete, malloc and free.
Link with MicroQuill's SmartHeap (64-bit) library for Linux on POWER. This is a library that
optimizes calls to new, delete, malloc and free.
Parameter | Description | Executable name
a | Assembler | as
b | Low-level optimizer | xlfcode
c | Compiler front end | xlfentry
d | Disassembler | dis
F | C preprocessor | cpp
h | Array language optimizer | xlfhot
I | High-level optimizer, compile step | ipa
l | Linker | ld
z | Binder | bolt
]]>
-qxlf90=nosignedzero
-qxlf90=<suboption>
Determines whether the compiler provides the
Fortran 90 or the Fortran 95 level of support for
certain aspects of the language. <suboption> can be
one of the following:
signedzero | nosignedzero
Determines how the SIGN(A,B) function handles
signed real 0.0. In addition, determines
whether negative internal values will be
prefixed with a minus when formatted output
would produce a negative sign zero.
autodealloc | noautodealloc
Determines whether the compiler deallocates
allocatable arrays that are declared locally
without either the SAVE or the STATIC
attribute and have a status of currently
allocated when the subprogram terminates.
oldpad | nooldpad
When the PAD=specifier is present in the
INQUIRE statement, specifying -qxlf90=nooldpad
returns UNDEFINED when there is no connection,
or when the connection is for unformatted I/O.
This behavior conforms with the Fortran 95
standard and above. Specifying -qxlf90=oldpad
preserves the Fortran 90 behavior.
Default:
o signedzero, autodealloc and nooldpad for the
xlf95, xlf95_r, xlf95_r7 and f95 invocation
commands.
o nosignedzero, noautodealloc and oldpad for
all other invocation commands.
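The signedzero | nosignedzero distinction for SIGN(A,B) can be modeled roughly in Python. This is a sketch, not XL Fortran behavior verified against the compiler: the function names are hypothetical, and the assumption that nosignedzero treats a negative zero B like a positive one should be checked against the XL Fortran reference.

```python
import math


def sign_signedzero(a, b):
    """SIGN(A,B) that honors the sign bit of B, so -0.0 counts as negative."""
    return math.copysign(abs(a), b)


def sign_nosignedzero(a, b):
    """SIGN(A,B) that treats any zero B as positive (assumed behavior)."""
    return abs(a) if b >= 0 else -abs(a)


# The two variants agree except when B is a negative zero.
print(sign_signedzero(1.0, -0.0), sign_nosignedzero(1.0, -0.0))
print(sign_signedzero(1.0, -2.0), sign_nosignedzero(1.0, -2.0))
```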
]]>
qstrict
Turns off aggressive optimizations which have the potential to alter the
semantics of your program. -qstrict sets -qfloat=nofltint:norsqrt.
qnostrict
Sets -qfloat=rsqrt.
These options are only valid with -O2 or higher optimization levels.
Default:
o -qnostrict at -O3 or higher.
o -qstrict otherwise.
]]>
Controls how shared and non-shared runtime libraries are linked into an application.
When -qstaticlink is in effect, the compiler links only static libraries with the object file named in the invocation. When -qnostaticlink is in effect, the compiler links shared libraries with the object file named in the invocation.
This option provides the ability to specify linking rules that are equivalent to those implied by the GNU options -static, -static-libgcc, and -shared-libgcc, used singly and in combination.
Disables generation of vector instructions for processors that support them.
"-Wl,--whole-archive /usr/lib/libhugetlbfs.a"
Instructs the linker to include every object file in the specified library,
rather than searching the library for the required object files.
"/usr/lib/libdl.a"
Instructs the linker to include libdl.a to enable the dynamic linking loader.
Turn off the effect of the --whole-archive flag.
Instructs the linker to allow multiple definitions of a symbol; the first definition
is used. Normally, when a symbol is defined multiple times, the linker reports
a fatal error.
Pass the --hugetlbfs-link=BDT flag to the linker so that
the text, initialized data, and BSS segments of the application are backed by hugepages.
Pass the --hugetlbfs-align flag to the linker so that we can control
(by environment variable HUGETLB_ELFMAP) which program segments are placed in hugepages.
-B/usr/share/libhugetlbfs/
Determines substitute path names for XL Fortran executables such as the compiler, assembler, linker, and preprocessor. It can be used in combination with the -t option, which determines which of these components are affected by -B.
Pass the -q flag to the linker so that the final executable retains its relocation information.
This macro indicates that the benchmark is being compiled on a PowerPC-based Linux System.
Causes the C++ compiler to generate Run Time Type Identification code for exception handling and for use by the typeid and dynamic_cast operators.
Causes the Fortran compiler to allocate dynamic arrays on the heap instead of the stack.
-qipa=noobject
Specifies whether to include standard object code in the object files.
The noobject suboption can substantially reduce overall
compilation time, by not generating object code during the first IPA phase.
This option does not affect the code in the final binary created.
]]>
-qipa=threads
The threads suboption allows the IPA optimizer to run portions
of the optimization process in parallel threads, which can speed up the
compilation process on multi-processor systems. All the available
threads, or the number specified by N, may be used. N must be a positive
integer. Specifying nothreads does not run any parallel threads;
this is equivalent to running one serial thread.
This option does not affect the code in the final binary created.
]]>
The path used to invoke the compilers.
-qsimd
-qnosimd
Enables the generation of vector instructions for processors
that support them.
-qassert=refalign
qassert=refalign | norefalign
Specifies that all pointers inside the compilation
unit only point to data that is naturally aligned
according to the length of the pointer types.
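Natural alignment here means that an address is a multiple of the size of the data it refers to. A minimal Python sketch of that check follows; the function name and the example addresses are hypothetical and serve only to illustrate the property the assertion promises to the compiler.

```python
def is_naturally_aligned(address, size):
    """Return True when address is a multiple of the referenced type's size."""
    return address % size == 0


# An 8-byte object at 0x1000 is naturally aligned; at 0x1004 it is not.
print(is_naturally_aligned(0x1000, 8))
print(is_naturally_aligned(0x1004, 8))
# A 4-byte object at 0x1004 is naturally aligned.
print(is_naturally_aligned(0x1004, 4))
```

If a program passes pointers that fail this check, asserting refalign could let the optimizer generate code that is incorrect for that data.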
]]>
-qipa=inline=limit=1000
-qipa=inline=threshold=100
The inline suboption specifies the threshold and
limit for inlining functions.
]]>
Link with the Apache C++ Standard Library ("stdcxx"). "libstd8d.so" is a 32-bit shared library with optimization enabled.
Adds the directory for the Apache C++ Standard Library to the search path at link time.
Specifies library search directory for the Apache C++ Standard Library for use by the runtime linker. The information is recorded in the object file and passed to the runtime linker.
]]>
Changes the default search path for the XL C++ header files to use the header files from Apache C++ Library.
-qipa=partition=large
The partition suboption specifies the size of the program
sections that are analyzed together. Larger partitions may produce
better analysis but require more storage. The default is medium.
]]>