SPEC® CFP2006 Result

Copyright 2006-2014 Standard Performance Evaluation Corporation

Tyan (Test Sponsor: Advanced Micro Devices)

Tyan YR190-B8228, AMD Opteron 4376 HE

CPU2006 license: 49
Test date: Oct-2012
Test sponsor: Advanced Micro Devices
Hardware Availability: Dec-2012
Tested by: Advanced Micro Devices
Software Availability: Aug-2012
Benchmark results graph
Hardware
CPU Name: AMD Opteron 4376 HE
CPU Characteristics: AMD Turbo CORE technology up to 3.60 GHz
CPU MHz: 2600
FPU: Integrated
CPU(s) enabled: 16 cores, 2 chips, 8 cores/chip
CPU(s) orderable: 1,2 chips
Primary Cache: 256 KB I on chip per chip, 64 KB I shared / 2 cores; 16 KB D on chip per core
Secondary Cache: 8 MB I+D on chip per chip, 2 MB shared / 2 cores
L3 Cache: 8 MB I+D on chip per chip
Other Cache: None
Memory: 64 GB (4 x 16 GB 2Rx4 PC3-12800R-11, ECC)
Disk Subsystem: 1 x 128 GB SSD
Other Hardware: None
Software
Operating System: Red Hat Enterprise Linux Server release 6.3, Kernel 2.6.32-279.el6.x86_64
Compiler: C/C++/Fortran: Version 4.5.2 of x86 Open64 Compiler Suite (from AMD)
Auto Parallel: No
File System: ext3
System State: Run level 3 (Full multiuser with network)
Base Pointers: 64-bit
Peak Pointers: 32/64-bit
Other Software: None

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

               Base                                                      Peak
Benchmark      Copies  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio    Copies  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
410.bwaves         16     1411    154     1403    155     1400    155        16     1373    158     1381    158     1373    158
416.gamess         16     1747    179     1750    179     1731    181        16     1586    198     1605    195     1590    197
433.milc           16     1115    132     1116    132     1116    132        16      966    152      965    152      966    152
434.zeusmp         16      672    217      677    215      674    216        16      659    221      659    221      654    223
435.gromacs        16      523    218      523    219      523    219        16      421    272      422    271      420    272
436.cactusADM      16      762    251      766    250      765    250        16      692    276      696    275      694    276
437.leslie3d       16     1397    108     1396    108     1398    108        16     1075    140     1073    140     1073    140
444.namd           16      710    181      713    180      714    180        16      616    208      612    210      615    209
447.dealII         16      482    380      485    378      482    380        16      434    422      441    415      427    429
450.soplex         16     1029    130     1030    130     1029    130        16      943    141      942    142      944    141
453.povray         16      352    242      352    242      352    242        16      309    276      308    277      308    276
454.calculix       16      380    347      380    348      380    348        16      365    361      365    362      365    362
459.GemsFDTD       16     1708   99.4     1708   99.4     1710   99.3        16     1488    114     1490    114     1491    114
465.tonto          16      759    207      758    208      758    208        16      692    227      684    230      693    227
470.lbm            16     1031    213     1030    213     1030    213        16     1031    213     1030    213     1030    213
481.wrf            16      927    193      930    192      928    193        16      927    193      930    192      928    193
482.sphinx3        16     1861    168     1857    168     1861    168        16     1371    227     1401    223     1376    227
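SPEC CPU2006 summarizes a rate run as the geometric mean of the per-benchmark median ratios. The sketch below applies that rule to the ratios transcribed from the table above. Because the table shows rounded ratios, the result may differ slightly from the officially reported metric, so no expected value is given. The names `RATIOS`, `geometric_mean`, and `overall` are illustrative, not part of the SPEC tools.

```python
from math import exp, log
from statistics import median

# (base ratios, peak ratios) for the three runs of each benchmark,
# transcribed from the results table.
RATIOS = {
    "410.bwaves":    ((154, 155, 155), (158, 158, 158)),
    "416.gamess":    ((179, 179, 181), (198, 195, 197)),
    "433.milc":      ((132, 132, 132), (152, 152, 152)),
    "434.zeusmp":    ((217, 215, 216), (221, 221, 223)),
    "435.gromacs":   ((218, 219, 219), (272, 271, 272)),
    "436.cactusADM": ((251, 250, 250), (276, 275, 276)),
    "437.leslie3d":  ((108, 108, 108), (140, 140, 140)),
    "444.namd":      ((181, 180, 180), (208, 210, 209)),
    "447.dealII":    ((380, 378, 380), (422, 415, 429)),
    "450.soplex":    ((130, 130, 130), (141, 142, 141)),
    "453.povray":    ((242, 242, 242), (276, 277, 276)),
    "454.calculix":  ((347, 348, 348), (361, 362, 362)),
    "459.GemsFDTD":  ((99.4, 99.4, 99.3), (114, 114, 114)),
    "465.tonto":     ((207, 208, 208), (227, 230, 227)),
    "470.lbm":       ((213, 213, 213), (213, 213, 213)),
    "481.wrf":       ((193, 192, 193), (193, 192, 193)),
    "482.sphinx3":   ((168, 168, 168), (227, 223, 227)),
}


def geometric_mean(values):
    """Geometric mean computed via logarithms."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


def overall(ratios, column):
    """Geometric mean of median ratios; column 0 is base, column 1 is peak."""
    return geometric_mean(median(runs[column]) for runs in ratios.values())


base = overall(RATIOS, 0)
peak = overall(RATIOS, 1)
print(f"base: {base:.1f}  peak: {peak:.1f}  peak/base: {peak / base:.3f}")
```

Using the median of three runs keeps a single slow or fast run from skewing a benchmark's contribution, and the geometric mean keeps any one benchmark with a large ratio (447.dealII here) from dominating the summary.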

Submit Notes

The config file option 'submit' was used.
'numactl' was used to bind copies to the cores.
See the configuration file for details.
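The configuration file itself is not reproduced in this report. As a rough sketch of what binding copies to cores with `numactl` can look like, the snippet below builds one command line per copy. The one-copy-per-core mapping (copy N on core N), the use of `--localalloc`, and the command `./benchmark_exe input.dat` are all assumptions made for illustration; only the use of `numactl` and the 16 copies come from this report.

```python
import shlex


def submit_command(copy_num, command):
    """Wrap one benchmark copy in numactl (assumed mapping: copy N -> core N)."""
    return [
        "numactl",
        "--localalloc",               # allocate memory on the NUMA node of the running core
        f"--physcpubind={copy_num}",  # restrict this copy to a single core
        *shlex.split(command),
    ]


# Hypothetical benchmark command; 16 copies as in the results table.
for copy_num in range(16):
    print(shlex.join(submit_command(copy_num, "./benchmark_exe input.dat")))
```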

Operating System Notes

'ulimit -s unlimited' was used to set the environment stack size.
'ulimit -l 2097152' was used to set the environment locked-pages-in-memory limit.

Set transparent_hugepage=never as a boot parameter in /boot/grub/menu.lst

Set vm/nr_hugepages=14336 in /etc/sysctl.conf
mount -t hugetlbfs nodev /mnt/hugepages

General Notes

Environment variables set by runspec before the start of the run:
HUGETLB_LIMIT = "896"
LD_LIBRARY_PATH = "/root/work/cpu2006v1.2/amd1206-rate-libs-revA/32:/root/work/cpu2006v1.2/amd1206-rate-libs-revA/64"
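The huge-page settings in the Operating System Notes and the HUGETLB_LIMIT value are consistent with each other, as the arithmetic below shows. It assumes that HUGETLB_LIMIT is a per-process cap counted in huge pages and that huge pages are 2 MiB (the usual x86-64 size, matching the `2m` in the `-HP` flags); neither assumption is stated explicitly in this report.

```python
COPIES = 16            # copies per benchmark, from the results table
HUGETLB_LIMIT = 896    # per-process limit set by runspec
NR_HUGEPAGES = 14336   # vm/nr_hugepages from /etc/sysctl.conf
HUGE_PAGE_MIB = 2      # assumption: 2 MiB huge pages

pages_for_all_copies = COPIES * HUGETLB_LIMIT
reserved_gib = NR_HUGEPAGES * HUGE_PAGE_MIB / 1024
per_copy_mib = HUGETLB_LIMIT * HUGE_PAGE_MIB

print(pages_for_all_copies == NR_HUGEPAGES)  # True: 16 x 896 = 14336
print(reserved_gib)                          # 28.0 (GiB reserved, out of 64 GB installed)
print(per_copy_mib)                          # 1792 (MiB of huge pages available per copy)
```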

The x86 Open64 Compiler Suite is only available from (and supported by) AMD at
http://developer.amd.com/cpu/open64

Binaries were compiled on a system with 2x AMD Opteron 6386SE chips and 128 GB of memory, running RHEL 6.3.

Base Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Base Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
447.dealII:  -DSPEC_CPU_LP64 
450.soplex:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 
482.sphinx3:  -DSPEC_CPU_LP64 

Base Optimization Flags

C benchmarks:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1 

C++ benchmarks:

 -Ofast   -static   -CG:load_exe=0   -OPT:malloc_alg=1   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -D__OPEN64_FAST_SET   -march=bdver1 

Fortran benchmarks:

 -Ofast   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256   -HP:bd=2m:heap=2m   -mso   -march=bdver1 

Benchmarks using both Fortran and C:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256 

Peak Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Peak Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 
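The peak portability list is three entries shorter than the base list. The short check below, using only names taken from this report, shows that the benchmarks without `-DSPEC_CPU_LP64` at peak are exactly those compiled with `-m32` in the Peak Optimization Flags, which matches the "Peak Pointers: 32/64-bit" entry in the Software section. The variable names are illustrative.

```python
ALL_BENCHMARKS = [
    "410.bwaves", "416.gamess", "433.milc", "434.zeusmp", "435.gromacs",
    "436.cactusADM", "437.leslie3d", "444.namd", "447.dealII", "450.soplex",
    "453.povray", "454.calculix", "459.GemsFDTD", "465.tonto", "470.lbm",
    "481.wrf", "482.sphinx3",
]

# Benchmarks listed with -DSPEC_CPU_LP64 under Peak Portability Flags.
PEAK_LP64 = {
    "410.bwaves", "416.gamess", "433.milc", "434.zeusmp", "435.gromacs",
    "436.cactusADM", "437.leslie3d", "444.namd", "453.povray", "454.calculix",
    "459.GemsFDTD", "465.tonto", "470.lbm", "481.wrf",
}

# Benchmarks whose Peak Optimization Flags include -m32.
PEAK_M32 = {"447.dealII", "450.soplex", "482.sphinx3"}

without_lp64 = [name for name in ALL_BENCHMARKS if name not in PEAK_LP64]
print(without_lp64)                   # ['447.dealII', '450.soplex', '482.sphinx3']
print(set(without_lp64) == PEAK_M32)  # True
```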

Peak Optimization Flags

C benchmarks:

433.milc:  -Ofast   -CG:movnti=1   -CG:locs_best=on   -HP:bdt=2m:heap=2m   -IPA:plimit=7000   -IPA:callee_limit=1200   -OPT:struct_array_copy=2   -OPT:alias=field_sensitive   -mso   -march=bdver1 
470.lbm:  basepeak = yes 
482.sphinx3:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -m32   -IPA:plimit=1000   -OPT:malloc_alg=2   -CG:cmp_peep=on   -CG:p2align=0   -CG:load_exe=1   -CG:dsched=on   -INLINE:aggressive=on   -LNO:prefetch=2   -LNO:prefetch_ahead=4   -mso   -march=bdver2 

C++ benchmarks:

444.namd:  -Ofast   -IPA:plimit=3000   -LNO:ignore_feedback=off   -CG:local_sched_alg=0   -CG:load_exe=0   -OPT:unroll_size=256   -fno-exceptions   -HP:bdt=2m:heap=2m   -LNO:if_select_conv=1   -OPT:alias=disjoint   -LNO:psimd_iso_unroll=ON   -march=bdver1 
447.dealII:  -Ofast   -D__OPEN64_FAST_SET   -static   -INLINE:aggressive=on   -LNO:opt=1   -LNO:simd=2   -fno-emit-exceptions   -m32   -OPT:unroll_times_max=8   -OPT:unroll_size=256   -OPT:unroll_level=2   -HP:bdt=2m:heap=2m   -GRA:unspill=on   -CG:cmp_peep=on   -CG:movext_icmp=off   -TENV:frame_pointer=off   -march=bdver1 
450.soplex:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -O3   -LNO:ignore_feedback=off   -INLINE:aggressive=on   -OPT:RO=1   -OPT:IEEE_arith=3   -OPT:IEEE_NaN_Inf=off   -OPT:fold_unsigned_relops=on   -fno-exceptions   -CG:p2align=0   -m32   -mno-fma4   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1 
453.povray:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -CG:pre_local_sched=off   -CG:p2align=0   -CG:p2align_split=on   -CG:dsched=on   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -OPT:transform=2   -OPT:alias=disjoint   -WOPT:aggcm=0   -march=bdver2 

Fortran benchmarks:

410.bwaves:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -OPT:Ofast   -OPT:treeheight=on   -LNO:blocking=off   -LNO:ignore_feedback=off   -LNO:fu=4   -LNO:loop_model_simd=on   -LNO:simd_rm_unity_remainder=on   -WOPT:aggstr=0   -HP:bdt=2m:heap=2m   -CG:cmp_peep=on   -march=bdver1 
416.gamess:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:fu=6   -LNO:blocking=0   -LNO:simd=2   -OPT:ro=3   -OPT:recip=on   -CG:local_sched_alg=1   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1 
434.zeusmp:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:interchange=off   -IPA:plimit=1500   -HP:bdt=2m:heap=2m   -march=bdver1 
437.leslie3d:  -Ofast   -CG:pre_minreg_level=2   -LNO:simd=0   -LNO:fusion=2   -HP:bdt=2m:heap=2m   -mso   -march=bdver1 
459.GemsFDTD:  -Ofast   -IPA:plimit=1500   -OPT:unroll_size=1024   -OPT:unroll_times_max=16   -LNO:fission=2   -CG:local_sched_alg=2   -HP   -march=bdver1 
465.tonto:  -Ofast   -OPT:alias=no_f90_pointer_alias   -LNO:blocking=off   -CG:load_exe=1   -CG:local_sched_alg=3   -IPA:plimit=525   -HP:bdt=2m:heap=2m   -march=bdver1 

Benchmarks using both Fortran and C:

435.gromacs:  -Ofast   -OPT:rsqrt=2   -HP:bdt=2m:heap=2m   -CG:local_sched_alg=2   -CG:load_exe=3   -GRA:unspill=on   -march=bdver1   -LNO:simd=3 
436.cactusADM:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:prefetch=2   -LNO:pf2=0   -LNO:prefetch_ahead=4   -HP   -CG:locs_shallow_depth=1   -CG:load_exe=0   -CG:dsched=on   -WOPT:sib=on   -march=bdver1 
454.calculix:  -Ofast   -OPT:unroll_size=256   -OPT:alias=disjoint   -GRA:optimize_boundary=on   -CG:dsched=on   -HP:bdt=2m:heap=2m   -march=bdver1 
481.wrf:  basepeak = yes 
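Several peak entries above use a two-pass, feedback-directed build: pass 1 compiles with `-fb_create fbdata`, the instrumented binary is run on a training workload, and pass 2 recompiles with `-fb_opt fbdata`. In a real run the SPEC tools drive these steps. The sketch below only assembles the two command lines for 434.zeusmp to make the sequence concrete; the source file name `zeusmp.f`, the output name, and the exact argument order are hypothetical.

```python
import shlex


def feedback_build(compiler, flags, sources, output, profile="fbdata"):
    """Return the pass-1 (instrumented) and pass-2 (profile-optimized) commands."""
    common = [*flags, *sources, "-o", output]
    pass1 = [compiler, "-fb_create", profile, *common]
    pass2 = [compiler, "-fb_opt", profile, *common]
    return pass1, pass2


# Peak flags for 434.zeusmp as listed above, minus the feedback options.
ZEUSMP_PEAK_FLAGS = [
    "-Ofast", "-LNO:blocking=off", "-LNO:interchange=off",
    "-IPA:plimit=1500", "-HP:bdt=2m:heap=2m", "-march=bdver1",
]

# "zeusmp.f" and "zeusmp" are placeholder names, not taken from the report.
pass1, pass2 = feedback_build("openf95", ZEUSMP_PEAK_FLAGS, ["zeusmp.f"], "zeusmp")
print(shlex.join(pass1))
print("# run the pass-1 binary on the training workload to write the profile")
print(shlex.join(pass2))
```

Entries marked `basepeak = yes` (470.lbm and 481.wrf) skip this entirely: their peak results are the base results, which is why the base and peak columns for those two benchmarks are identical in the results table.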

The flags file that was used to format this result can be browsed at
http://www.spec.org/cpu2006/flags/x86-open64-452-flags-rate-revA-II.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/cpu2006/flags/x86-open64-452-flags-rate-revA-II.xml.