IBM Linux Flag Disclosure
SPEC CPU2000 & OMP2001
Last Revised 17 August, 2004

Source Level Portability Options
================================

-DHOST_WORDS_BIG_ENDIAN (176.gcc)
    Host system is big-endian.

-DLINUX_PPC32 (186.crafty)
    Sets some basic parameters like endianness, OS type, and ANSI
    language extensions to be compatible with a Linux system.

-DHAS_ERRLIST (252.eon)
    Indicates that the system provides the "sys_nerr" and
    "sys_errlist[]" variables in 252.eon.

-DSPEC_CPU2000_LINUX_PPC32 (253.perlbmk)
    Compile the SPEC CPU2000 modified perl for a Linux system.

-DSPEC_CPU2000_NEED_BOOL (253.perlbmk)
    Use the SPEC-provided definition of the boolean type.

-DHAS_FGETPOS (253.perlbmk)
    Indicates that the system provides the fgetpos routine to get the
    file position indicator.

-DHAS_FSETPOS (253.perlbmk)
    Indicates that the system provides the fsetpos routine to set the
    file position indicator.

-DSYS_STRING_H (254.gap)
    Do not explicitly include string.h.

-DSYS_IS_USG (254.gap)
    Indicates that the operating system is USG compliant.

-DSYS_HAS_IOTCL_PROTO (254.gap)
    Do not explicitly declare ioctl().

-DSYS_HAS_CALLOC_PROTO (254.gap)
    Do not supply a prototype for calloc().

-DHAVE_SIGNED_CHAR (300.twolf)
    System allows signed char type.

Compiler Invocation
===================

xlc
    Invokes the compiler for C source files with a default language
    level of ansi and specifies that it allow type-based aliasing.

cc
    Invokes the compiler for C source files with a default language
    level of extended and specifies that it provide compatibility with
    older IBM compilers and allow placement of string literals or
    constant values in read/write storage. cc does not conform to the
    ISO/ANSI C standard.

xlc_r
    The same as "xlc" except that it generates a threadsafe executable,
    compliant with the POSIX pthreads API.

xlf
    Invokes the compiler for Fortran source files with a default
    language of Fortran 77.

xlf_r
    The same as "xlf" except that it generates a threadsafe executable,
    compliant with the POSIX pthreads API.
xlf90
    Invokes the compiler for Fortran source files with a default
    language of Fortran 90.

xlf90_r
    The same as "xlf90" except that it generates a threadsafe
    executable, compliant with the POSIX pthreads API.

Compiler Options
================

-ma
    Use built-in alloca() function.

-O
    Performs optimizations that the compiler developers considered the
    best combination for compilation speed and runtime performance.

-O3
    Performs some memory and compile-time intensive optimizations in
    addition to those executed with -O. The -O3 specific optimizations
    have the potential to slightly alter the semantics of a user's
    program. Optimizations may include, but are not limited to:
    aggressive code motion and scheduling on computations that have
    the potential to raise an exception; relaxed conformance to IEEE
    rules in cases where the difference in the results is not
    important to an application; rewriting of floating point
    expressions.

-O4
    Equivalent to -O3 -qipa -qhot with automatic generation of
    architecture (-qarch=) and tuning (-qtune=) options ideal for that
    platform. The -qipa level defaults to level=1.

-O5
    Equivalent to -O3 -qipa=level=2 -qhot with automatic generation of
    architecture (-qarch=) and tuning (-qtune=) options ideal for that
    platform.

-D_ILS_MACROS
    Defined in /usr/include/ctype.h to use the macro version of the
    string classification functions (e.g. isupper()).

-Q, -qinline
    The -Q option without any list inlines all appropriate procedures,
    subject to limits on the number of inlined calls and the amount of
    code size increase as a result. -qinline is an alias for -Q.

-Q=xxx
    Inline all functions that contain fewer than xxx lines of abstract
    code units.

-q64
    Selects 64-bit compiler mode.

-q32
    Selects 32-bit compiler mode.

-qalign=natural
    The compiler maps structure members to their natural boundaries.
-qansialias
    Use type-based aliasing during optimization.

-qarch=ppc
    Produces object code containing instructions that will run on any
    of the 32-bit PowerPC hardware platforms.

-qarch=pwr3
    Produces object code containing instructions that will run on
    POWER3 processors.

-qarch=pwr4
    Produces object code containing instructions that will run on
    POWER4/POWER4+ processors.

-qarch=pwr5
    Produces object code containing instructions that will run on
    POWER5 processors.

-qarch=rs64b
    Produces object code containing instructions that will run on
    RS64-II processors.

-qdatalocal
    Changes the default to assume that all variables are local.

-qessl
    Specifies that, if either -lessl or -lesslsmp is also specified,
    then Engineering and Scientific Subroutine Library (ESSL) routines
    should be used in place of some Fortran 90 intrinsic procedures
    when there is a safe opportunity to do so.

-qlibessl
    Specifies that all functions whose names match ESSL library
    functions are, in fact, the library functions.

-qfdpr
    Collect information about programs for use with the AIX fdpr
    (Feedback Directed Program Restructuring) performance-tuning
    utility.

-qfixed
    Indicates that the input source program is in fixed form. Allows
    fixed format Fortran 77 programs to be compiled using the xlf90
    compiler invocation.

-qfixed=
    States that Fortran code is in fixed source form, with an optional
    argument specifying the maximum line length.

-qfloat=rsqrt
    Changes a division by the result of a square root operation into a
    multiply by the reciprocal of the square root.

-qhot
    Perform high-order transformations on loops during optimization.

-qhot=arraypad
    Pad the sizes of arrays to align better in cache.

-qipa=level=1
    Turns on interprocedural analysis with inlining, limited alias
    analysis, and limited call-site tailoring. This is the default
    level of -qipa.

-qipa=level=2
    Turns on interprocedural analysis with inlining, cloning, full
    alias analysis, constant propagation, call-site tailoring, and
    dead code removal.
-qipa=noobject
    Do not generate object files during the first stage of
    interprocedural analysis.

-qinline
    Alias for -Q. See -Q.

-qipa=partition=large
    Specifies the size of the regions within the program to analyze.
    Larger partitions contain more procedures, which result in better
    interprocedural analysis but require more storage to optimize.

-qlanglvl=ansi
    Compilation conforms to the ANSI standard.

-qlargepage
    Indicates that a program, designed to execute in a large page
    memory environment, can take advantage of large 16 MB pages
    provided on POWER4 or better CPUs.

-qmaxmem=-1
    Allows the compiler to use as much memory as it needs to execute.

-qpdf1/pdf2
    Profile directed feedback optimization.

-qsave
    Sets the default storage class for local variables to STATIC.

-qsmp=omp
    Enable OpenMP parallelization directives.

-qsuffix=f=f90
    Sets the suffix for source files to be .f90. The .f90 suffix is
    required by xlf90 to compile Fortran 90 programs.

-qtune=604
    Instruction selection, scheduling, and other implementation
    dependent performance enhancements for the PowerPC 604/604e
    processor.

-qtune=pwr3
    Instruction selection, scheduling, and other implementation
    dependent performance enhancements for the POWER3 processor.

-qtune=pwr4
    Instruction selection, scheduling, and other implementation
    dependent performance enhancements for the POWER4/POWER4+
    processors.

-qtune=pwr5
    Instruction selection, scheduling, and other implementation
    dependent performance enhancements for the POWER5 processors.

-qtune=rs64b
    Instruction selection, scheduling, and other implementation
    dependent performance enhancements for the RS64-II processor.

-qunroll=n
    Unrolls inner loops in the program by a factor of n.

-w
    Suppress warning messages from the C, C++, and Fortran compilers.

Linker Options
==============

-lessl
    Link the Engineering and Scientific Subroutine Library (ESSL).

-qessl
    Link to the Engineering and Scientific Subroutine Library (ESSL).
-lpdf
    Routines used in the first pass of the profile directed feedback
    process. Routines from this library are not used in building the
    final executable. In newer compilers, -qpdf1 does this
    automatically, so using this in conjunction with -qpdf1 is
    redundant.

Linux Environment Variables
===========================

MALLOCMULTIHEAP=1
    Maintains multiple heaps in the process, for servicing
    simultaneous "malloc" requests.

OMP_DYNAMIC=FALSE
    Disables dynamic adjustment of the number of available threads.

OMP_NUM_THREADS=...
    The exact number of threads available to be used, or if
    OMP_DYNAMIC is TRUE, the upper limit on the number of available
    threads.

XLFRTEOPTS=NAMELIST=OLD
    Allows a newly compiled program to read the namelist from a binary
    compiled with the older namelist format.

XLSMPOPTS
    A list of runtime settings affecting SMP execution. Here are some
    of the possibilities:

    SCHEDULE=STATIC
        Work is scheduled to threads round-robin.

    SPINS=0
        Allows work-requests to spin indefinitely without the thread
        having to yield the time-slice.

    STACK=8000000
        Specifies the largest allowable size of a thread's stack.

    YIELDS=0
        Allows the thread to yield an indefinite number of times
        without being driven into a sleep state.

Stack Size Information:
=======================

Stack size set to unlimited using the command "ulimit -s unlimited".
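
As an illustration only, the environment variables and stack size setting described above might be combined in a run script along the following lines. This is a sketch under stated assumptions, not the configuration used for any published result: the thread count of 4 is a made-up value, and joining the XLSMPOPTS sub-options with colons is assumed from the usual IBM XL convention rather than stated in this disclosure.

```shell
#!/bin/sh
# Sketch of a run-time environment built from the settings listed above.
# The thread count (4) is an illustrative assumption, not a disclosed value.

# Raise the stack limit; this can fail when the hard limit is lower.
ulimit -s unlimited 2>/dev/null || echo "warning: could not raise the stack limit" >&2

export MALLOCMULTIHEAP=1        # multiple heaps for simultaneous malloc requests
export OMP_DYNAMIC=FALSE        # no dynamic adjustment of the thread count
export OMP_NUM_THREADS=4        # assumed value; exact count since OMP_DYNAMIC is FALSE
export XLFRTEOPTS=NAMELIST=OLD  # read namelists in the older format

# XLSMPOPTS sub-options from the list above; the colon separator is an assumption.
export XLSMPOPTS="SCHEDULE=STATIC:SPINS=0:STACK=8000000:YIELDS=0"

echo "OMP_NUM_THREADS=$OMP_NUM_THREADS XLSMPOPTS=$XLSMPOPTS"
```

Because "ulimit" and "export" affect only the current shell and its children, a script like this would need to launch the benchmark itself, or be sourced into the shell that does.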