Description of compiler flags for Intel C++ Compiler 9.1 / Itanium(R) used in Bull Windows submissions
------------------------------------------------------------------------------------------------------

Performance
-----------
/O1             optimize for maximum speed, but disable some optimizations which
                increase code size for a small speed benefit. Also disables
                software pipelining and global code scheduling:
                /Gfsy /Ob1gysi- /Qunroll0
/O2             optimize for maximum speed (DEFAULT): /Gfsy /Ob1gyti
/O3             optimize for maximum speed and enable high-level optimizations
/Ox             enable maximum optimizations: /Gs /Ob1gyti
                (same as /O2 without /Gfy)
/Od             disable optimizations; useful for selective optimizations
                (i.e. /Od /Og)
/fast           enable /O3 /Qipo
/Ob<n>          control inline expansion:
                  n=0  disables inlining
                  n=1  inline functions declared with __inline, and perform
                       C++ inlining
                  n=2  inline any function, at the compiler's discretion
                       (same as /Qip)
/Og             enable global optimizations
/Oi[-]          enable/disable inline expansion of intrinsic functions
/Op[-]          enable/disable better floating-point precision
/Os             enable speed optimizations, but disable some optimizations which
                increase code size for small speed benefit (overrides /Ot)
/Ot             enable speed optimizations (overrides /Os)
/Oa[-]          assume no aliasing in program
/Ow[-]          assume no aliasing within functions, but assume aliasing across
                calls
/G1             optimize for Itanium(R) processor
/G2             optimize for Itanium(R) 2 processor (DEFAULT)
/GR[-]          enable/disable C++ RTTI
/GX[-]          enable/disable C++ exception handling (/GX is same as /EHsc)
/Qcxx_features  enable standard C++ features (-GX -GR)
/EHa            enable asynchronous C++ exception handling model
/EHs            enable synchronous C++ exception handling model
/EHc            assume extern "C" functions do not throw exceptions
/Ge             enable stack checks for all functions
/Gs[n]          disable stack checks for functions with less than n bytes of
                locals
/Gf             enable string pooling optimization
/GF             enable read-only string pooling optimization
/Gy             separate functions for the linker (COMDAT)
/GA             optimize for Windows application (assume .exe)
/GT             enable fiber-safe thread local storage
/Qnopic         disable generation of position independent code
/Ap64           assume 64-bit size for pointers (DEFAULT)
/As32           assume 32-bit address space
/As64           assume 64-bit address space (DEFAULT)

Advanced Performance
--------------------
Enable and specify the scope of Interprocedural (IP) Optimizations:
/Qip                        enable single-file IP optimizations
                            (within files, same as /Ob2)
/Qipo[n]                    enable multi-file IP optimizations (between files)
/Qipo_c                     generate a multi-file object file (ipo_out.obj)
/Qipo_S                     generate a multi-file assembly file (ipo_out.asm)

Modify the behavior of IP:
/Qip_no_inlining            disable IP inlining (requires /Qip or /Qipo)
/Qipo_obj                   force generation of real object files
                            (requires /Qipo)
/Qipo_separate              create one object file for every source file
                            (overrides /Qipo[n])

Other Advanced Performance Options:
/Qunroll0                   disable loop unrolling
/Qprof_dir                  specify directory for profiling output files
                            (*.dyn and *.dpi)
/Qprof_file                 specify file name for profiling summary file
/Qprof_gen[x]               instrument program for profiling; with the x
                            qualifier, extra information is gathered for use
                            with the PROFORDER tool
/Qprof_use                  enable use of profiling information during
                            optimization
/Qfnsplit[-]                enable/disable function splitting
                            (enabled with /Qprof_use)
/Qopt_report                generate an optimization report to stderr
/Qopt_report_file           specify the filename for the generated report
/Qopt_report_level[level]   specify the level of report verbosity (min|med|max)
/Qopt_report_phase          specify the phase that reports are generated against
/Qopt_report_routine        reports on routines containing the given name
/Qopt_report_help           display the optimization phases available for
                            reporting
/Qtcheck                    generate instrumentation to detect multi-threading
                            bugs (requires Intel(R) Threading Tools; cannot be
                            used with compiler alone)
/Qopenmp                    enable the compiler to generate multi-threaded code
                            based on the OpenMP directives
/Qopenmp_profile            link with instrumented OpenMP runtime library to
                            generate OpenMP profiling information for use with
                            the OpenMP component of the VTune(TM) Performance
                            Analyzer
/Qopenmp_stubs              enables the user to compile OpenMP programs in
                            sequential mode. The OpenMP directives are ignored
                            and a stub OpenMP library is linked (sequential)
/Qopenmp_report{0|1|2}      control the OpenMP parallelizer diagnostic level
/Qparallel                  enable the auto-parallelizer to generate
                            multi-threaded code for loops that can be safely
                            executed in parallel
/Qpar_report{0|1|2|3}       control the auto-parallelizer diagnostic level
/Qpar_threshold[n]          set threshold for the auto-parallelization of loops
                            where n is an integer from 0 to 100
/Qalias_args[-]             enable(DEFAULT)/disable C/C++ rule that function
                            arguments may be aliased; when disabling the rule,
                            the user asserts that this is safe
/Qansi_alias[-]             enable/disable(DEFAULT) use of ANSI aliasing rules
                            in optimizations; user asserts that the program
                            adheres to these rules
/Qcomplex_limited_range[-]  enable/disable(DEFAULT) the use of the basic
                            algebraic expansions of some complex arithmetic
                            operations. This can allow for some performance
                            improvement in programs which use a lot of complex
                            arithmetic, at the cost of some exponent range.
/Qchkstk[-]                 enable/disable call _chkstk for every call to
                            alloca()
/Qivdep_parallel            make ivdep directives mean no loop carried
                            dependencies
/Qserialize-volatile[-]     enable/disable strict memory access ordering for
                            volatile data object references
/Qftz[-]                    enable/disable flush denormal results to zero
/QIPF_fma[-]                enable/disable the combining of floating point
                            multiplies and add/subtract operations
/QIPF_fltacc[-]             enable/disable optimizations that affect floating
                            point accuracy
/QIPF_flt_eval_method0      floating point operands evaluated to the precision
                            indicated by program
/QIPF_fp_speculation        enable floating point speculations with the
                            following conditions:
                              fast   - speculate floating point operations
                                       (DEFAULT)
                              safe   - speculate only when safe
                              strict - same as off
                              off    - disables speculation of floating-point
                                       operations
/QIPF_fp_relaxed[-]         enable/disable use of faster but slightly less
                            accurate code sequences for math functions
/Qauto_ilp32                specify that the application cannot exceed a 32-bit
                            address space (/Qipo[n] required)

Portability options for CPU2000:
-------------------------------

175.vpr:
--------
-DSPEC_CPU2000_P64: Necessary to enable the P64 porting changes

176.gcc:
--------
-Dalloca=_alloca  : Use the built-in optimized alloca
/F60000000        : 176.gcc uses alloca, and this option tells the linker to
                    pre-allocate 60MB of stack. The default amount of stack
                    allocated is not enough, and without this option 176.gcc
                    crashes with a run-time error

178.galgel:
-----------
-FI               : Fixed-format F90 source code.
/F32000000        : Same as with 176.gcc, pre-allocates a 32MB stack

181.mcf:
--------
-DSPEC_CPU2000_P64: Necessary to enable the P64 porting changes

186.crafty:
-----------
-DNT_i386         : Specifies that the target is a Windows NT Intel
                    processor-based system, which makes the compiler use
                    "long long" as the 64-bit variable type that 186.crafty
                    needs.

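The -DNT_i386 note above can be illustrated with a minimal sketch. It is not
taken from the 186.crafty sources: the names board64_t, square_mask and
board64_bits are hypothetical, and the #ifndef block only stands in for
passing -DNT_i386 on the command line. It shows why a Windows build selects
"long long": there "long" is only 32 bits wide, so a 64-bit mask needs the
wider type.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for -DNT_i386 on the command line (assumption for this sketch). */
#ifndef NT_i386
#define NT_i386 1
#endif

/* Hypothetical typedef: pick a type that is 64 bits wide on the target. */
#if defined(NT_i386)
typedef unsigned long long board64_t;  /* 64 bits even where long is 32 bits */
#else
typedef unsigned long board64_t;       /* 64 bits only where long is 64 bits */
#endif

/* Number of bits in board64_t on this platform. */
unsigned board64_bits(void)
{
    return (unsigned)(sizeof(board64_t) * CHAR_BIT);
}

/* Return a mask with only bit `square` (0..63) set. */
board64_t square_mask(unsigned square)
{
    assert(square < 64);
    return (board64_t)1 << square;
}
```

With the typedef in place, square_mask(63) yields a value with only the top
bit set, which a 32-bit "long" could not represent.
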
252.eon:
--------
-DSPEC_CPU2000_P64: Necessary to enable the P64 porting changes

253.perlbmk:
------------
-DSPEC_CPU2000_NTOS: Enables the code changes for porting to Windows
-DPERLDLL          : On Windows, we need a single perl.exe instead of a
                     perl.exe plus a perl.dll. This pre-define ensures that
                     the changes necessary to get a single, UNIX-style
                     executable are applied, avoiding the indirect calls that
                     can cause a 10% performance degradation. This allows the
                     Windows-based executable to be as close as possible to
                     the Unix-based one.
-DSPEC_CPU2000_P64 : Necessary to enable the P64 porting changes
-DHAS_LONG_LONG -DUSE_LONG_LONG
                   : Make the compiler use "long long" for the 64-bit integer
                     variables

254.gap:
--------
-DSPEC_CPU2000    :
-DSPEC_CPU2000_P64: Necessary to enable the P64 porting changes

255.vortex:
-----------
-DSPEC_CPU2000_P64: Necessary to enable the P64 porting changes
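
Several benchmarks above need -DSPEC_CPU2000_P64. The sketch below is not
taken from the benchmark sources; it only illustrates the kind of issue that
P64 porting changes usually address, on the assumption that "P64" means
64-bit pointers combined with 32-bit int and long, as on 64-bit Windows. The
helper names are hypothetical. Code that stores a pointer in a "long" loses
the upper bits under that model, so pointer-sized integer types are used
instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when "long" cannot hold a pointer, as is the case on 64-bit
   Windows; returns 0 on systems where long and pointers are equally wide. */
int long_is_narrower_than_pointer(void)
{
    return sizeof(long) < sizeof(void *);
}

/* Round-trip a pointer through an integer that is wide enough to hold it.
   Using uintptr_t instead of long keeps all pointer bits under P64. */
void *roundtrip_pointer(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits;
}

/* Distance in bytes between two pointers into the same object. ptrdiff_t
   follows the pointer width, unlike int or long under P64. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}
```

On a 64-bit Windows build long_is_narrower_than_pointer() is expected to
return 1, which is the situation the P64 define is there to handle.
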