CPU2006 Result Flag Description

Base Optimization Flags

C benchmarks

- -xSSE4.1
- COPTIMIZE
- Code is optimized for Intel(R) processors with support for SSE 4.1 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations, including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  Do not use this option if the program will run on a processor that is not an Intel processor. If this option is used to compile the main program (in Fortran) or the function main() (in C/C++), the program displays a fatal run-time error when it is executed on an unsupported processor.
- -ipo
- COPTIMIZE
- Multi-file interprocedural (IP) optimizations that include:
  - inline function expansion
  - interprocedural constant propagation
  - dead code elimination
  - propagation of function characteristics
  - passing arguments in registers
  - loop-invariant code motion
- -O3
- COPTIMIZE
- Enables O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On IA-32 and Intel EM64T processors, when O3 is used with options -ax or -x (Linux) or with options /Qax or /Qx (Windows), the compiler performs more aggressive data dependency analysis than for O2, which may result in longer compilation times. The O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to O2 optimizations. The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      
      -funroll-loops
      
      -fno-builtin
      
      -mno-ieee-fp
      
      -fomit-frame-pointer
      
      -ffunction-sections
      
      -ftz
- -no-prec-div
- COPTIMIZE
- (disable/enable[default] -prec-div)
  -no-prec-div enables optimizations that give slightly less precise results than full IEEE division.
  
  When you specify -no-prec-div along with some optimizations, such as -xN and -xB (Linux) or /QxN and /QxB (Windows), the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation.
  
  However, sometimes the value produced by this transformation is not as accurate as full IEEE division. When it is important to have fully precise IEEE division, do not use -no-prec-div. This will enable the default -prec-div and the result will be more accurate, with some loss of performance.
- -static
- COPTIMIZE
- Compiler option to statically link in libraries at link time
- -inline-calloc
- COPTIMIZE
- Directs the compiler to inline calloc() calls as malloc()/memset()
- -opt-malloc-options=3
- COPTIMIZE
- The compiler adds setup code in the C/C++/Fortran main function to enable optimal malloc algorithms:
  - n=0: Default, no changes to the malloc options. No call to mallopt() is made.
  - n=1: M_MMAP_MAX=2 and M_TRIM_THRESHOLD=0x10000000. Call mallopt with the two settings.
  - n=2: M_MMAP_MAX=2 and M_TRIM_THRESHOLD=0x40000000. Call mallopt with these two settings.
  - n=3: M_MMAP_MAX=0 and M_TRIM_THRESHOLD=-1. Call mallopt with these two settings. This will cause use of sbrk() calls instead of mmap() calls to get memory from the system.
  The two parameters, M_MMAP_MAX and M_TRIM_THRESHOLD, are described below.
  
  Function: int mallopt (int param, int value)
  
  When calling mallopt, the param argument specifies the parameter to be set, and value the new value to be set. Possible choices for param, as defined in malloc.h, are:
  - M_TRIM_THRESHOLD This is the minimum size (in bytes) of the top-most, releasable chunk that will cause sbrk to be called with a negative argument in order to return memory to the system.
  - M_TOP_PAD This parameter determines the amount of extra memory to obtain from the system when a call to sbrk is required. It also specifies the number of bytes to retain when shrinking the heap by calling sbrk with a negative argument. This provides the necessary hysteresis in heap size such that excessive amounts of system calls can be avoided.
  - M_MMAP_THRESHOLD All chunks larger than this value are allocated outside the normal heap, using the mmap system call. This way it is guaranteed that the memory for these chunks can be returned to the system on free. Note that requests smaller than this threshold might still be allocated via mmap.
  - M_MMAP_MAX The maximum number of chunks to allocate with mmap. Setting this to zero disables all use of mmap.
- -opt-prefetch
- COPTIMIZE
- Enables or disables (the default is disabled) the generation of prefetch instructions by the compiler to prefetch data.
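
The n=3 setting of -opt-malloc-options described above can be approximated by hand. The following is a minimal sketch, assuming a glibc system where mallopt() and the M_* constants are declared in malloc.h; the helper name apply_malloc_options_3 is hypothetical and is not the code the compiler actually inserts into main.

```c
#include <malloc.h>   /* mallopt, M_MMAP_MAX, M_TRIM_THRESHOLD (glibc-specific) */

/* Hypothetical helper: applies the two n=3 settings manually.
   Returns 1 if both mallopt() calls report success, 0 otherwise. */
int apply_malloc_options_3(void)
{
    /* M_MMAP_MAX = 0: disable all use of mmap() for allocations,
       so memory is obtained from the system through sbrk(). */
    int ok_mmap = mallopt(M_MMAP_MAX, 0);

    /* M_TRIM_THRESHOLD = -1: as set by n=3; in glibc this
       disables trimming of the top of the heap. */
    int ok_trim = mallopt(M_TRIM_THRESHOLD, -1);

    return ok_mmap != 0 && ok_trim != 0;
}
```

Such a helper would be called once at the start of main(), before the first large allocation, which is roughly the point at which the setup code described above would take effect.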

C++ benchmarks

- -xSSE4.1
- CXXOPTIMIZE
- Code is optimized for Intel(R) processors with support for SSE 4.1 instructions. The resulting code may contain unconditional use of features that are not supported on other processors. This option also enables new optimizations in addition to Intel processor-specific optimizations, including advanced data layout and code restructuring optimizations to improve memory accesses for Intel processors.
  
  Do not use this option if the program will run on a processor that is not an Intel processor. If this option is used to compile the main program (in Fortran) or the function main() (in C/C++), the program displays a fatal run-time error when it is executed on an unsupported processor.
- -ipo
- CXXOPTIMIZE
- Multi-file interprocedural (IP) optimizations that include:
  - inline function expansion
  - interprocedural constant propagation
  - dead code elimination
  - propagation of function characteristics
  - passing arguments in registers
  - loop-invariant code motion
- -O3
- CXXOPTIMIZE
- Enables O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
  - Loop unrolling, including instruction scheduling
  - Code replication to eliminate branches
  - Padding the size of certain power-of-two arrays to allow more efficient cache use.
  On IA-32 and Intel EM64T processors, when O3 is used with options -ax or -x (Linux) or with options /Qax or /Qx (Windows), the compiler performs more aggressive data dependency analysis than for O2, which may result in longer compilation times. The O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to O2 optimizations. The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
- Includes:
  - -O2
    - -O1
      
      -funroll-loops
      
      -fno-builtin
      
      -mno-ieee-fp
      
      -fomit-frame-pointer
      
      -ffunction-sections
      
      -ftz
- -no-prec-div
- CXXOPTIMIZE
- (disable/enable[default] -prec-div)
  -no-prec-div enables optimizations that give slightly less precise results than full IEEE division.
  
  When you specify -no-prec-div along with some optimizations, such as -xN and -xB (Linux) or /QxN and /QxB (Windows), the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation.
  
  However, sometimes the value produced by this transformation is not as accurate as full IEEE division. When it is important to have fully precise IEEE division, do not use -no-prec-div. This will enable the default -prec-div and the result will be more accurate, with some loss of performance.
- -opt-prefetch
- CXXOPTIMIZE
- Enables or disables (the default is disabled) the generation of prefetch instructions by the compiler to prefetch data.
- -Wl,-z,muldefs
- EXTRA_LDFLAGS
- Enable SmartHeap and/or other library usage by forcing the linker to ignore multiple definitions if present
- -L/spec/cpu2006.1.1/lib -lsmartheap
- EXTRA_LIBS
- MicroQuill SmartHeap Library V8.1 available from http://www.microquill.com/
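
The reciprocal transformation mentioned under -no-prec-div (A/B computed as A * (1/B)) can be written out by hand to see where the loss of precision comes from. This is a sketch only: the function names div_exact, div_recip and rel_diff are hypothetical, and whether a given compiler actually applies the transformation depends on the options and the target.

```c
/* Full division, as performed under the default -prec-div. */
double div_exact(double a, double b)
{
    return a / b;
}

/* Hand-written form of the reciprocal transformation: A * (1/B). */
double div_recip(double a, double b)
{
    double r = 1.0 / b;   /* first rounding happens here */
    return a * r;         /* second rounding happens here */
}

/* Relative difference between the two results.
   Assumes b and a/b are nonzero. */
double rel_diff(double a, double b)
{
    double exact = div_exact(a, b);
    double diff = exact - div_recip(a, b);
    if (diff < 0) diff = -diff;
    if (exact < 0) exact = -exact;
    return diff / exact;
}
```

The division is rounded once, while the reciprocal form is rounded twice, so the two results can differ in the last bits. When the divisor is a power of two, its reciprocal is normally exactly representable and the two functions agree.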

Implicitly Included Flags

This section contains descriptions of flags that were included implicitly by other flags, but which do not have a permanent home at SPEC.

For questions about the meanings of these flags, please contact the tester.
For other inquiries, please contact webmaster@spec.org
Copyright 2006-2014 Standard Performance Evaluation Corporation
Tested with SPEC CPU2006 v1.1.
Report generated on Tue Jul 22 19:32:46 2014 by SPEC CPU2006 flags formatter v6906.

CPU2006 Flag Description
Fujitsu Siemens Computers PRIMERGY RX300 S4, Intel Xeon X5470, 3.33 GHz

Base Compiler Invocation

C benchmarks

C++ benchmarks

Base Portability Flags

400.perlbench

462.libquantum

483.xalancbmk

Base Optimization Flags

C benchmarks

C++ benchmarks

Base Other Flags

C benchmarks

403.gcc

Implicitly Included Flags

	Indicates that the flag description came from the user flags file.
	Indicates that the flag description came from the suite-wide flags file.
	Indicates that the flag description came from a per-benchmark flags file.
