Description of compiler flags for Intel C++ Compiler 8.1 for IA-32,
EM64T and Itanium architectures

Architecture specific flags and actions are noted, where appropriate.
--------------------------------------------------------

-O1
    Optimize to favor code size and code locality. Disables loop
    unrolling. -O1 may improve performance for applications with very
    large code size, many branches, and execution time not dominated
    by code within loops. In most cases, -O2 is recommended over -O1.

-O2 (DEFAULT)
    Optimize for code speed. This is the generally recommended
    optimization level.
    IPF: Turn software pipelining ON.

-O3
    Enable -O2 optimizations and, in addition, enable more aggressive
    optimizations such as loop and memory access transformation. The
    -O3 optimizations may slow down code in some cases compared to -O2
    optimizations. Recommended for applications that have loops with
    heavy use of floating-point calculations and process large data
    sets.

-ax<codes>
    Generate code specialized for processor extensions specified by
    <codes> while also generating generic code. <codes> includes one
    or more of the following characters:
        i   Pentium Pro and Pentium II processor instructions
        M   MMX(TM) instructions
        K   Streaming SIMD Extensions (implies i and M above)
        W   Pentium 4 processor with Streaming SIMD Extensions 2
            (implies i, M and K)
        N   Pentium 4 processor with Streaming SIMD Extensions 2
        P   Pentium 4 processor with Streaming SIMD Extensions 3

-x<codes>
    Generate specialized code to run exclusively on processors
    supporting the extensions indicated by <codes> as described above.

----------------------------------------------------------------------------------
Additional Notes for EM64T:
----------------------------------------------------------------------------------
On Intel(R) EM64T systems, -axW and -axP are the only valid options.
----------------------------------------------------------------------------------

----------------------------------------------------------------------------------
Additional Notes on -xP:
----------------------------------------------------------------------------------
-xP
    The -xP option targets your program to run on Intel Pentium 4 and
    compatible Intel processors. The resulting code might contain
    unconditional use of features that are not supported on other
    processors. Programs in which the function main() is compiled with
    this option will detect non-compatible processors and generate an
    error message during execution.

    This option also enables new optimizations in addition to Intel
    processor-specific optimizations. These options also enable
    advanced data layout and code restructuring optimizations to
    improve memory accesses for Intel processors.
----------------------------------------------------------------------------------

-Ob{0|1|2}
    Controls the compiler's inline expansion.
        0: disables inlining.
        1: disables inlining unless -ip or -Ob2 is specified.
        2: enables inlining of any function. However, the compiler
           decides which functions are inlined. This option enables
           interprocedural optimizations and has the same effect as
           specifying the -ip option.

-IPF_fp_relaxed[-]
    (IPF only) Enable [disable] use of faster but slightly less
    accurate code sequences for math functions, such as divide and
    square root.

-ip
    Enable single-file IP optimizations (within files, same as -Ob2).

-ipo
    Multi-file IP optimizations that include:
        - inline function expansion
        - interprocedural constant propagation
        - dead code elimination
        - propagation of function characteristics
        - passing arguments in registers
        - loop-invariant code motion

-fast
    The -fast option maximizes speed across the entire program. It
    sets command options that can improve run-time performance, as
    follows:
    For Itanium-based systems, -fast sets -O3, -ipo, and -static.
    For IA-32 and Intel(R) EM64T systems, -fast sets -O3, -ipo,
    -static, and -xP. Note that on IA-32 and Intel(R) EM64T systems,
    programs compiled with the -xP option will detect non-compatible
    processors and generate an error message during execution.

-ansi_alias
    Directs the compiler to assume that the program adheres to the
    type-based aliasing rules defined in Section 6.5 of the ISO C
    Standard. If your program adheres to these rules, this option
    allows the compiler to optimize more aggressively. If it does not
    adhere to these rules, the option can cause the compiler to
    generate incorrect code.

-auto_ilp32
    Specify that the application cannot exceed a 32-bit address space.

-cxxlib-icc
    Directs the Intel compiler to use the C++ run-time libraries and
    C++ header files included with the Intel compiler. They include:
        libcprts               standard C++ headers
        libcprts               standard C++ library
        libcxa and libunwind   C++ language support

-prof_gen
    Instrument the program for profiling, for the first phase of
    two-phase profile-guided optimization.

-prof_use
    Instructs the compiler to produce a profile-optimized executable
    and merges available dynamic information (.dyn) files into a
    pgopti.dpi file. If you perform multiple executions of the
    instrumented program, -prof_use merges the dynamic information
    files again and overwrites the previous pgopti.dpi file. Without
    any other options, the current directory is searched for .dyn
    files.

-alias_args[-]
    Assume arguments may be aliased (DEFAULT) [not aliased].

-------------------------------------------------------------

Description of compiler flags for Intel FORTRAN Compiler 8.1 for IA-32,
EM64T and Itanium architectures

Architecture specific flags and actions are noted, where appropriate.
--------------------------------------------------------

-O1
    Maximize speed; disables some optimizations that increase code
    size for a small speed benefit. This option enables global
    optimization.
    This includes data-flow analysis, code motion, strength reduction
    and test replacement, split-lifetime analysis, and instruction
    scheduling. Specifying -O2 includes the optimizations performed by
    -O1. Note that, on IA-32 systems, -O1 and -O2 are equivalent.

-O2 (DEFAULT)
    Minimizes size; optimizes for speed, but disables some
    optimizations that increase code size for a small speed benefit;
    for the Itanium(R) compiler, -O1 turns off software pipelining to
    reduce code size. This option enables local optimizations within
    the source program unit, recognition of common subexpressions, and
    expansion of integer multiplication and division using shifts.

-O3
    Maximize speed plus use higher-level optimizations; optimizations
    include loop transformation, software pipelining, and (IA-32 only)
    prefetching; this option may not improve performance for some
    programs. Specifying -O3 includes the optimizations performed by
    -O2. This option enables additional global optimizations that
    improve speed (at the cost of extra code size). These
    optimizations include:
        o Loop unrolling, including instruction scheduling
        o Code replication to eliminate branches
        o Padding the size of certain power-of-two arrays to allow
          more efficient cache use.

-ax<codes>
    Generate code specialized for processor extensions specified by
    <codes> while also generating generic code. <codes> includes one
    or more of the following characters:
        i   Pentium Pro and Pentium II processor instructions
        M   MMX(TM) instructions
        K   Streaming SIMD Extensions (implies i and M above)
        W   Pentium 4 processor with Streaming SIMD Extensions 2
            (implies i, M and K)
        N   Pentium 4 processor with Streaming SIMD Extensions 2
        P   Pentium 4 processor with Streaming SIMD Extensions 3

-x<codes>
    Generate specialized code to run exclusively on processors
    supporting the extensions indicated by <codes> as described above.
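
One way to picture the difference between -ax and -x is a program that carries both a generic routine and a specialized one and picks between them. The C sketch below is only an illustration under assumptions: the names sum_generic, sum_specialized and select_sum, and the cpu_has_extension flag, are hypothetical, and this text does not describe how the compiler actually chooses a code path.

```c
#include <stddef.h>

/* Generic path: a plain loop that would run on any processor. */
static long sum_generic(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Stand-in for a specialized path. A real one would use
 * processor-specific instructions; here it is just unrolled by two. */
static long sum_specialized(const int *v, size_t n) {
    long total = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        total += v[i] + v[i + 1];
    }
    if (i < n) {
        total += v[i];  /* leftover element when n is odd */
    }
    return total;
}

typedef long (*sum_fn)(const int *, size_t);

/* Hypothetical selector: the caller supplies the feature flag.
 * With both paths present (the -ax picture) either answer works;
 * with only the specialized path (the -x picture) a processor
 * without the extension has nothing to fall back on. */
sum_fn select_sum(int cpu_has_extension) {
    return cpu_has_extension ? sum_specialized : sum_generic;
}
```

Both paths return the same result; only the instructions used to compute it would differ in a real build.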
----------------------------------------------------------------------------------
Additional Notes for EM64T:
----------------------------------------------------------------------------------
On Intel(R) EM64T systems, -axW and -axP are the only valid options.
----------------------------------------------------------------------------------

----------------------------------------------------------------------------------
Additional Notes on -xP:
----------------------------------------------------------------------------------
-xP
    The -xP option targets your program to run on Intel Pentium 4 and
    compatible Intel processors. The resulting code might contain
    unconditional use of features that are not supported on other
    processors. Programs in which the function main() is compiled with
    this option will detect non-compatible processors and generate an
    error message during execution.

    This option also enables new optimizations in addition to Intel
    processor-specific optimizations. These options also enable
    advanced data layout and code restructuring optimizations to
    improve memory accesses for Intel processors.
----------------------------------------------------------------------------------

-Ob{0|1|2}
    Controls the compiler's inline expansion.
        0: disables inlining.
        1: disables inlining unless -ip or -Ob2 is specified.
        2: enables inlining of any function. However, the compiler
           decides which functions are inlined. This option enables
           interprocedural optimizations and has the same effect as
           specifying the -ip option.

-IPF_fp_relaxed
    (IPF only) Enables use of faster but slightly less accurate code
    sequences for math functions, such as divide and sqrt. When
    compared to strict IEEE* precision, this option slightly reduces
    the accuracy of floating-point calculations performed by these
    functions, usually limited to the least significant digit.
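
The kind of accuracy trade-off described for -IPF_fp_relaxed can be illustrated in C by comparing an ordinary division with a multiply-by-reciprocal sequence. This is a sketch under assumptions, not the code sequence the compiler emits: the names divide_exact, divide_relaxed and relative_gap_in_eps are hypothetical, and the example only shows how a two-step formulation can differ from a single division in the last bits.

```c
#include <float.h>

/* Reference result: one division, rounded once on IEEE 754 hardware. */
double divide_exact(double x, double y) {
    return x / y;
}

/* Hypothetical "relaxed" variant: two roundings
 * (the reciprocal, then the multiply). */
double divide_relaxed(double x, double y) {
    double recip = 1.0 / y;
    return x * recip;
}

/* Absolute value without pulling in the math library. */
static double abs_d(double v) {
    return v < 0.0 ? -v : v;
}

/* Relative difference between the two results, measured in units of
 * DBL_EPSILON. Only meaningful when x / y is nonzero and finite. */
double relative_gap_in_eps(double x, double y) {
    double exact = divide_exact(x, y);
    double relaxed = divide_relaxed(x, y);
    return abs_d(exact - relaxed) / (abs_d(exact) * DBL_EPSILON);
}
```

For operands whose reciprocal is exactly representable, such as 8.0 / 2.0, the two variants agree; for others the gap stays within a few units in the last place.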

-ip
    Enable single-file IP optimizations (within files, same as -Ob2).

-ipo
    Multi-file IP optimizations that include:
        - inline function expansion
        - interprocedural constant propagation
        - dead code elimination
        - propagation of function characteristics
        - passing arguments in registers
        - loop-invariant code motion

-fast
    The -fast option maximizes speed across the entire program. It
    sets command options that can improve run-time performance, as
    follows:
    For Itanium-based systems, -fast sets -O3, -ipo, and -static.
    For IA-32 and Intel(R) EM64T systems, -fast sets -O3, -ipo,
    -static, and -xP. Note that on IA-32 and Intel(R) EM64T systems,
    programs compiled with the -xP option will detect non-compatible
    processors and generate an error message during execution.

-ansi_alias
    Enables (default) or disables the assumption that the program
    adheres to the ANSI Fortran type aliasability rules. For example,
    an object of type real cannot be accessed as an integer. See the
    ANSI standard for the complete set of rules.

-prof_gen
    Instrument the program for profiling, for the first phase of
    two-phase profile-guided optimization.

-prof_use
    Instructs the compiler to produce a profile-optimized executable
    and merges available dynamic information (.dyn) files into a
    pgopti.dpi file. If you perform multiple executions of the
    instrumented program, -prof_use merges the dynamic information
    files again and overwrites the previous pgopti.dpi file. Without
    any other options, the current directory is searched for .dyn
    files.

-scalar_rep(-)
    Enables (disables) scalar replacement performed during loop
    transformations (requires /O3).

-auto
    Causes all variables to be allocated on the stack, rather than in
    local static storage. Does not affect variables that appear in an
    EQUIVALENCE or SAVE statement, or those that are in COMMON. Makes
    all local variables AUTOMATIC, same as /4Ya.

Portability options for CPU2000:
-------------------------------

178.galgel:
    -FI
        Fixed-format F90 source code.

187.facerec:
    srcalt=AllocShape
        This src.alt adds code to check that the allocatable array
        FTemp is allocated before calling SHAPE on it.

176.gcc:
    -Dalloca=_alloca
        Replace occurrences of alloca() with _alloca().
    -DUSG
        Specify that the programming environment is like System V
        Unix systems.
    srcalt=64bitgcc35
        This src.alt eliminates the use of the cast-as-lvalue
        extension, which allows 176.gcc to be built on systems using
        GCC 3.5 or later.

186.crafty:
    -DLINUX_i386
        Linux Intel system; use "long long" as the 64-bit variable
        type.

252.eon:
    -DHAS_ERRLIST
        Indicates that the system provides the "sys_nerr" and
        "sys_errlist[]" variables.
    srcalt=fmax_errno
        This is needed for systems using GNU GLIBC 2.3.2, or any other
        compilation environment where errno is not exported by default
        or where the system definition of fmax differs from that in
        252.eon.
    srcalt=stdcpp
        This src.alt addresses issues with the lack of legacy header
        files as well as the old fmax() problem.

253.perlbmk:
    -DSPEC_CPU2000_LINUX_I386
        Enable the code changes for porting to Linux on the i386
        architecture.
    -DSPEC_CPU2000_NEED_BOOL
        Use the SPEC-provided definition of the boolean type.
    -DSPEC_CPU2000_GLIBC22
        Compatibility with glibc 2.2 and later versions.
    -DSPEC_CPU2000_LP64
        Compile using the LP64 programming model.

254.gap:
    -DSYS_IS_USG
        Indicates that the operating system is USG compliant.
    -DSYS_HAS_IOCTL_PROTO
        Do not explicitly declare ioctl().
    -DSYS_HAS_TIME_PROTO
        Do not explicitly declare time().
    -DSYS_HAS_SIGNAL_PROTO
        Do not explicitly #include
    -DSYS_HAS_ANSI
        System is ANSI compliant.
    -DSYS_HAS_CALLOC_PROTO
        Do not supply a prototype for calloc().
    -DSPEC_CPU2000_LP64
        Compile using the LP64 programming model.

255.vortex:
    -DSPEC_CPU2000_LP64
        Compile using the LP64 programming model.
    srcalt=closed_files
        The original code attempted a final write using file pointers
        that had just been closed. Those file pointers are set to null
        in the approved source alt.
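
The closed_files change for 255.vortex can be pictured with a small C sketch: once a stream is closed, its pointer is set to NULL so that a later write can be detected and skipped. The helper names close_and_clear and write_if_open are hypothetical and are not taken from the 255.vortex source; this is one possible shape of the pattern, not the approved source alt itself.

```c
#include <stdio.h>

/* Close the stream and null the caller's pointer, so later code can
 * tell that it is no longer usable. Returns the fclose() result, or
 * 0 if the pointer was already NULL. */
int close_and_clear(FILE **fp) {
    int rc = 0;
    if (*fp != NULL) {
        rc = fclose(*fp);
        *fp = NULL;
    }
    return rc;
}

/* Write only when the stream is still open.
 * Returns 0 on success, -1 if the write was skipped or failed. */
int write_if_open(FILE *fp, const char *text) {
    if (fp == NULL) {
        return -1;  /* stream already closed: skip the write */
    }
    return (fputs(text, fp) == EOF) ? -1 : 0;
}
```

Without the NULL assignment, the pointer would still hold its old value after fclose(), and a final write through it would be undefined behavior in C.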