SPEC® CFP2006 Result

Copyright 2006-2017 Standard Performance Evaluation Corporation

Hewlett Packard Enterprise (Test Sponsor: HPE)

ProLiant DL385 Gen10
(2.20 GHz, AMD EPYC 7601)

CPU2006 license: 3 Test date: Oct-2017
Test sponsor: HPE Hardware Availability: Nov-2017
Tested by: HPE Software Availability: Sep-2017
Benchmark results graph
Hardware
CPU Name: AMD EPYC 7601
CPU Characteristics: AMD Turbo CORE technology up to 3.20 GHz
CPU MHz: 2200
FPU: Integrated
CPU(s) enabled: 64 cores, 2 chips, 32 cores/chip, 2 threads/core
CPU(s) orderable: 1, 2 chip(s)
Primary Cache: 64 KB I + 32 KB D on chip per core
Secondary Cache: 512 KB I+D on chip per core
L3 Cache: 64 MB I+D on chip per chip, 8 MB shared / 4 cores
Other Cache: None
Memory: 1 TB (16 x 64 GB 4Rx4 PC4-2666V-L)
Disk Subsystem: 1 x 300 GB 15 K RPM SAS, RAID 0
Other Hardware: None
Software
Operating System: SUSE Linux Enterprise Server 12 (x86_64) SP3
Kernel 4.4.73-5-default
Compiler: C/C++/Fortran: Version 4.5.2.1 of x86 Open64
Compiler Suite (from AMD)
Auto Parallel: No
File System: ext4
System State: Run level 3 (multi-user)
Base Pointers: 64-bit
Peak Pointers: 32/64-bit
Other Software: None

Results Table

Benchmark Base Peak
Copies Seconds Ratio Seconds Ratio Seconds Ratio Copies Seconds Ratio Seconds Ratio Seconds Ratio
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
410.bwaves 128 1305 1330 1306 1330 1305 1330 64 615 1410 615 1410 616 1410
416.gamess 128 1131 2220 1136 2210 1137 2200 128 1024 2450 1026 2440 1024 2450
433.milc 128 1027 1140 1028 1140 1028 1140 64 415 1420 415 1420 415 1420
434.zeusmp 128 483 2410 483 2410 483 2410 128 479 2430 477 2440 480 2420
435.gromacs 128 470 1940 462 1980 439 2080 128 313 2920 315 2900 312 2920
436.cactusADM 128 589 2600 590 2590 588 2600 64 282 2710 283 2700 278 2750
437.leslie3d 128 1234 975 1237 973 1235 974 64 452 1330 451 1330 453 1330
444.namd 128 527 1950 527 1950 526 1950 128 441 2330 441 2330 444 2310
447.dealII 128 413 3540 419 3500 421 3480 128 366 4000 379 3860 365 4020
450.soplex 128 1031 1040 1030 1040 1030 1040 64 435 1230 436 1220 436 1220
453.povray 128 252 2710 253 2700 252 2700 128 202 3380 203 3360 202 3370
454.calculix 128 358 2950 359 2940 357 2960 128 386 2740 390 2700 379 2790
459.GemsFDTD 128 1559 871 1561 870 1559 871 64 688 987 686 990 688 987
465.tonto 128 604 2090 608 2070 602 2090 64 273 2310 273 2310 273 2310
470.lbm 128 898 1960 899 1960 897 1960 64 433 2030 431 2040 432 2040
481.wrf 128 850 1680 849 1680 850 1680 64 411 1740 412 1740 411 1740
482.sphinx3 128 1770 1410 1771 1410 1770 1410 64 634 1970 628 1990 630 1980

Submit Notes

The config file option 'submit' was used.
'numactl' was used to bind copies to the cores.
See the configuration file for details.

Operating System Notes

'ulimit -s unlimited' was used to set environment stack size
'ulimit -l 2097152' was used to set environment locked pages in memory limit

runspec command invoked through numactl i.e.:
 numactl --interleave=all runspec <etc>

Set dirty_ratio=8 to limit dirty cache to 8% of memory
Set swappiness=1 to swap only if necessary
Set zone_reclaim_mode=1 to free local node memory and avoid remote memory
sync then drop_caches=3 to reset caches before invoking runcpu
Linux governor set to performance with cpupower "cpupower frequency-set -r -g performance"

Transparent huge pages were enabled for this run (OS default)

Set vm/nr_hugepages=114688 in /etc/sysctl.conf
mount -t hugetlbfs nodev /mnt/hugepages

Platform Notes

 BIOS Configuration:
  Thermal Configuration set to Maximum Cooling
  Performance Determinism set to Power Deterministic
  Processor Power and Utilization Monitoring set to Disabled
  Workload Profile set to General Throughput Compute
   Minimum Processor Idle Power Core C-State set to C6 State

General Notes

Environment variables set by runspec before the start of the run:
HUGETLB_LIMIT = "896"
LD_LIBRARY_PATH = "/home/cpu2006/amd1603-rate-libs-revB/32:/home/cpu2006/amd1603-rate-libs-revB/64"

The binaries were built with the AMD supported x86 Open64 Compiler Suite,
which is only available from AMD at
http://developer.amd.com/tools-and-sdks/cpu-development/x86-open64-compiler-suite/

Base Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Base Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
447.dealII:  -DSPEC_CPU_LP64 
450.soplex:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 
482.sphinx3:  -DSPEC_CPU_LP64 

Base Optimization Flags

C benchmarks:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

C++ benchmarks:

 -Ofast   -static   -CG:load_exe=0   -OPT:malloc_alg=1   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -D__OPEN64_FAST_SET   -march=bdver2   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Fortran benchmarks:

 -Ofast   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256   -HP:bd=2m:heap=2m   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Benchmarks using both Fortran and C:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256 

Peak Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Peak Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 

Peak Optimization Flags

C benchmarks:

433.milc:  -Ofast   -CG:movnti=1   -CG:locs_best=on   -HP:bdt=2m:heap=2m   -IPA:plimit=7000   -IPA:callee_limit=1200   -OPT:struct_array_copy=2   -OPT:alias=field_sensitive   -mso   -march=bdver1   -mno-fma4 
470.lbm:  -Ofast   -CG:cmp_peep=on   -OPT:keep_ext=on   -HP:bdt=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -march=bdver1   -mno-fma4   -mso 
482.sphinx3:  -Ofast   -m32   -IPA:plimit=1000   -OPT:malloc_alg=2   -CG:cmp_peep=on   -CG:p2align=0   -CG:load_exe=1   -CG:dsched=on   -INLINE:aggressive=on   -LNO:prefetch=2   -LNO:prefetch_ahead=4   -mso   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

C++ benchmarks:

444.namd:  -Ofast   -IPA:plimit=3000   -LNO:ignore_feedback=off   -CG:local_sched_alg=0   -CG:load_exe=0   -OPT:unroll_size=256   -fno-exceptions   -HP:bdt=2m:heap=2m   -LNO:if_select_conv=1   -OPT:alias=disjoint   -LNO:psimd_iso_unroll=ON   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm 
447.dealII:  -Ofast   -D__OPEN64_FAST_SET   -static   -INLINE:aggressive=on   -LNO:opt=1   -LNO:simd=2   -fno-emit-exceptions   -m32   -OPT:unroll_times_max=8   -OPT:unroll_size=256   -OPT:unroll_level=2   -HP:bdt=2m:heap=2m   -GRA:unspill=on   -CG:cmp_peep=on   -CG:movext_icmp=off   -TENV:frame_pointer=off   -march=bdver1   -mno-fma4 
450.soplex:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -O3   -LNO:ignore_feedback=off   -INLINE:aggressive=on   -OPT:RO=1   -OPT:IEEE_arith=3   -OPT:IEEE_NaN_Inf=off   -OPT:fold_unsigned_relops=on   -fno-exceptions   -CG:p2align=0   -m32   -mno-fma4   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1 
453.povray:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -CG:pre_local_sched=off   -CG:p2align=0   -CG:p2align_split=on   -CG:dsched=on   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -OPT:transform=2   -OPT:alias=disjoint   -WOPT:aggcm=0   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm   -Wl,   -z,muldefs 

Fortran benchmarks:

410.bwaves:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -OPT:Ofast   -OPT:treeheight=on   -LNO:blocking=off   -LNO:ignore_feedback=off   -LNO:fu=4   -LNO:loop_model_simd=on   -LNO:simd_rm_unity_remainder=on   -WOPT:aggstr=0   -HP:bdt=2m:heap=2m   -CG:cmp_peep=on   -march=bdver2   -mno-fma4 
416.gamess:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:fu=6   -LNO:blocking=0   -LNO:simd=2   -OPT:ro=3   -OPT:recip=on   -CG:local_sched_alg=1   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1   -mno-fma4 
434.zeusmp:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:interchange=off   -IPA:plimit=1500   -HP:bdt=2m:heap=2m   -march=bdver2   -mno-fma4 
437.leslie3d:  -Ofast   -CG:pre_minreg_level=2   -LNO:simd=0   -LNO:fusion=2   -HP:bdt=2m:heap=2m   -mso   -march=bdver1   -mno-fma4 
459.GemsFDTD:  -Ofast   -IPA:plimit=1500   -OPT:unroll_size=1024   -OPT:unroll_times_max=16   -LNO:fission=2   -CG:local_sched_alg=2   -HP   -march=bdver1   -mno-fma4 
465.tonto:  -Ofast   -OPT:alias=no_f90_pointer_alias   -LNO:blocking=off   -CG:load_exe=1   -CG:local_sched_alg=3   -IPA:plimit=525   -HP:bdt=2m:heap=2m   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

Benchmarks using both Fortran and C:

435.gromacs:  -Ofast   -OPT:rsqrt=2   -HP:bdt=2m:heap=2m   -CG:local_sched_alg=2   -CG:load_exe=3   -GRA:unspill=on   -march=bdver2   -mno-fma4   -LNO:simd=3 
436.cactusADM:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:prefetch=2   -LNO:pf2=0   -LNO:prefetch_ahead=4   -HP   -CG:locs_shallow_depth=1   -CG:load_exe=0   -CG:dsched=on   -WOPT:sib=on   -march=bdver2   -mno-fma4 
454.calculix:  -Ofast   -OPT:unroll_size=256   -OPT:alias=disjoint   -GRA:optimize_boundary=on   -CG:dsched=on   -HP:bdt=2m:heap=2m   -march=bdver1   -mno-fma4 
481.wrf:  -Ofast   -LNO:blocking=off   -LANG:copyinout=off   -IPA:callee_limit=5000   -GRA:prioritize_by_density=on   -HP   -WOPT:sib=on   -march=bdver1   -mno-fma4 

The flags files that were used to format this result can be browsed at
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.html,
http://www.spec.org/cpu2006/flags/HPE-Platform-Flags-AMD-V1.2-EPYC-revB.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.xml,
http://www.spec.org/cpu2006/flags/HPE-Platform-Flags-AMD-V1.2-EPYC-revB.xml.