SPEC® ACCEL™ OMP Result

Copyright 2015-2017 Standard Performance Evaluation Corporation

Intel (Test Sponsor: Technische Universitaet Dresden)

Intel Xeon Phi 7210

Intel Server System LADMP00AP Family (Xeon Phi 7210, 1.3 GHz, 64 cores, 4 threads)

SPECaccel_omp_base = 4.39

SPECaccel_omp_peak = 6.08

ACCEL license: 37A
Test date: Jul-2017
Test sponsor: Technische Universitaet Dresden
Hardware Availability: Jun-2016
Tested by: Technische Universitaet Dresden
Software Availability: Dec-2016
Hardware
CPU Name: Intel Xeon Phi 7210
CPU Characteristics: Intel Turbo Boost 2 Technology up to 1.50 GHz
CPU MHz: 1300
CPU MHz Maximum: 1500
FPU: None
CPU(s) enabled: 64 cores, 1 chip, 64 cores/chip, 4 threads/core
CPU(s) orderable: 1 chip
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 1 MB I+D on chip per 2 cores
L3 Cache: 16 GB I+D on chip per chip
Other Cache: None
Memory: 96 GB (6 x 16 GB 2Rx4 PC4-2400T-R, running at 1066 MHz)
Disk Subsystem: 275 GB INTEL SSDSC2BB30
Other Hardware: --
Accelerator
Accel Model Name: Xeon Phi 7210
Accel Vendor: Intel
Accel Name: Intel Xeon Phi 7210
Type of Accel: CPU
Accel Connection: N/A
Does Accel Use ECC: yes
Accel Description: Intel Xeon Phi 7210, SMT ON, Turbo ON; Cluster Mode: Quadrant, Memory Mode: Cache
Accel Driver:
Software
Operating System: CentOS Linux release 7.3 3.10.0-514.21.2.el7.x86_64
Compiler: Intel Compiler C/C++/Fortran Version 17.0.1 20161005
File System: ext4
System State: Run level 3 (user-level)
Other Software: FFTW 3.3.6pl1

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

                Base                                            Peak
Benchmark       Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
503.postencil      43.9   2.48     44.1   2.47     43.3   2.51     35.9   3.04     38.6   2.82     36.0   3.03
504.polbm          23.1   5.27     22.6   5.40     22.5   5.42     21.6   5.65     22.1   5.52     21.5   5.67
514.pomriq          294   2.11      293   2.12      291   2.13      233   2.67      230   2.70      230   2.70
550.pmd            77.9   3.09     77.5   3.11     77.4   3.12     55.0   4.38     55.1   4.38     54.9   4.39
551.ppalm          4660  0.117     4654  0.117     4645  0.117      251   2.17      251   2.17      252   2.16
552.pep            97.4   2.37     97.4   2.37     97.5   2.37     61.6   3.75     61.4   3.76     61.5   3.76
553.pclvrleaf       103   11.1      102   11.2      103   11.1      113   10.1      113   10.2      113   10.1
554.pcg             210   1.59      211   1.58      212   1.57      195   1.71      195   1.70      195   1.70
555.pseismic       50.6   5.57     51.2   5.51     51.3   5.50     52.2   5.41     52.3   5.40     52.3   5.39
556.psp            42.5   19.2     42.5   19.3     42.8   19.1     38.8   21.1     38.4   21.3     38.6   21.2
557.pcsp           56.1   15.3     55.6   15.4     55.9   15.4     46.3   18.5     44.4   19.3     45.3   19.0
559.pmniGhost      68.0   5.84     67.9   5.85     68.4   5.80     68.4   5.81     68.3   5.81     68.7   5.78
560.pilbdc         76.6   8.53     76.8   8.50     77.7   8.41     68.1   9.58     67.8   9.63     68.8   9.49
563.pswim          32.8   4.85     32.9   4.83     32.8   4.85     29.4   5.41     29.1   5.46     29.2   5.44
570.pbt            30.8   25.3     30.7   25.4     30.9   25.2     25.5   30.6     25.5   30.6     25.6   30.5
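The overall scores can be cross-checked against the per-benchmark ratios in the Results Table. The following is a minimal sketch, assuming the overall metric is the geometric mean of each benchmark's median ratio (the usual convention for SPEC suites). The dictionary names and the `overall` helper are illustrative and not part of the SPEC tools.

```python
from statistics import geometric_mean, median

# Ratios transcribed from the Results Table (three runs per benchmark).
BASE = {
    "503.postencil": (2.48, 2.47, 2.51),
    "504.polbm": (5.27, 5.40, 5.42),
    "514.pomriq": (2.11, 2.12, 2.13),
    "550.pmd": (3.09, 3.11, 3.12),
    "551.ppalm": (0.117, 0.117, 0.117),
    "552.pep": (2.37, 2.37, 2.37),
    "553.pclvrleaf": (11.1, 11.2, 11.1),
    "554.pcg": (1.59, 1.58, 1.57),
    "555.pseismic": (5.57, 5.51, 5.50),
    "556.psp": (19.2, 19.3, 19.1),
    "557.pcsp": (15.3, 15.4, 15.4),
    "559.pmniGhost": (5.84, 5.85, 5.80),
    "560.pilbdc": (8.53, 8.50, 8.41),
    "563.pswim": (4.85, 4.83, 4.85),
    "570.pbt": (25.3, 25.4, 25.2),
}
PEAK = {
    "503.postencil": (3.04, 2.82, 3.03),
    "504.polbm": (5.65, 5.52, 5.67),
    "514.pomriq": (2.67, 2.70, 2.70),
    "550.pmd": (4.38, 4.38, 4.39),
    "551.ppalm": (2.17, 2.17, 2.16),
    "552.pep": (3.75, 3.76, 3.76),
    "553.pclvrleaf": (10.1, 10.2, 10.1),
    "554.pcg": (1.71, 1.70, 1.70),
    "555.pseismic": (5.41, 5.40, 5.39),
    "556.psp": (21.1, 21.3, 21.2),
    "557.pcsp": (18.5, 19.3, 19.0),
    "559.pmniGhost": (5.81, 5.81, 5.78),
    "560.pilbdc": (9.58, 9.63, 9.49),
    "563.pswim": (5.41, 5.46, 5.44),
    "570.pbt": (30.6, 30.6, 30.5),
}


def overall(runs):
    """Geometric mean of the per-benchmark median ratios (assumed metric definition)."""
    return geometric_mean([median(r) for r in runs.values()])


print(f"base ~ {overall(BASE):.2f}, peak ~ {overall(PEAK):.2f}")
```

The results should land close to the reported SPECaccel_omp_base = 4.39 and SPECaccel_omp_peak = 6.08. Small differences in the last digit are expected because the table ratios are already rounded to three significant figures.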

Submit Notes

The config file option 'submit' was used.
submit = numactl -p 0 $command
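To illustrate what the submit line does, the sketch below expands the template for a single benchmark invocation. It only mimics the `$command` substitution with Python's `string.Template`; it is not the SPEC harness's own code, and the benchmark command line in the usage example is hypothetical. In `numactl`, `-p` is the short form of `--preferred`, so memory is preferentially allocated on NUMA node 0.

```python
import shlex
from string import Template

# Submit template copied from the Submit Notes.
SUBMIT = "numactl -p 0 $command"


def expand_submit(command):
    """Substitute a benchmark command into the submit template and split it into argv."""
    return shlex.split(Template(SUBMIT).substitute(command=command))


# Hypothetical benchmark command, used for illustration only.
print(expand_submit("./pmd_exe -i input.dat"))
```

Returning an argument list rather than a single string makes the result usable with `subprocess.run` without invoking a shell.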

Platform Notes

 Sysinfo program /tmp/spec-accel/1.2/Docs/sysinfo
 $Rev: 6965 $ $Date:: 2015-04-21 #$ c05a7f14b1b1765e3fe1df68447e8a35
 running on taurusknl28.taurus.hrsk.tu-dresden.de Mon Jul 24 10:45:36 2017

 This section contains SUT (System Under Test) info as seen by
 some common utilities.  To remove or add to this section, see:
   http://www.spec.org/accel/Docs/config.html#sysinfo

 From /proc/cpuinfo
    model name : Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz
       1 "physical id"s (chips)
       256 "processors"
    cores, siblings (Caution: counting these is hw and system dependent.  The
    following excerpts from /proc/cpuinfo might not be reliable.  Use with
    caution.)
       cpu cores : 64
       siblings  : 256
       physical 0: cores 0 1 2 3 6 7 10 11 12 13 14 15 18 19 20 21 22 23 24 25 26
       27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
       52 53 56 57 58 59 60 61 62 63 64 65 68 69 70 71 72 73
    cache size : 1024 KB

 From /proc/meminfo
    MemTotal:       98707216 kB
    HugePages_Total:       0
    Hugepagesize:       2048 kB

 /usr/bin/lsb_release -d
    CentOS Linux release 7.3.1611 (Core)

 From /etc/*release* /etc/*version*
    centos-release: CentOS Linux release 7.3.1611 (Core)
    centos-release-upstream: Derived from Red Hat Enterprise Linux 7.3 (Source)
    os-release:
       NAME="CentOS Linux"
       VERSION="7 (Core)"
       ID="centos"
       ID_LIKE="rhel fedora"
       VERSION_ID="7"
       PRETTY_NAME="CentOS Linux 7 (Core)"
       ANSI_COLOR="0;31"
       CPE_NAME="cpe:/o:centos:centos:7"
    redhat-release: CentOS Linux release 7.3.1611 (Core)
    system-release: CentOS Linux release 7.3.1611 (Core)
    system-release-cpe: cpe:/o:centos:centos:7

 uname -a:
    Linux taurusknl28.taurus.hrsk.tu-dresden.de 3.10.0-514.21.2.el7.x86_64 #1 SMP
    Tue Jun 20 12:24:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

 run-level 3 Jun 30 13:46

 SPEC is set to: /tmp/spec-accel/1.2
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda1      ext4  275G  6.5G  255G   3% /
 Additional information from dmidecode:

    Warning: Use caution when you interpret this section. The 'dmidecode' program
    reads system data which is "intended to allow hardware to be accurately
    determined", but the intent may not be met, as there are frequent changes to
    hardware, firmware, and the "DMTF SMBIOS" standard.


 (End of data from sysinfo program)

General Notes

Used Environment Variables:
  ENV_KMP_AFFINITY=compact,0 - assigns OpenMP threads contiguously
  ENV_OMP_NUM_THREADS=128 - limits the number of threads started to 128
  ENV_KMP_HW_SUBSET=1S,64C,2T - controls thread distribution across sockets, cores, and hardware threads
  ENV_FORT_BUFFERED=true - enables buffered I/O for Fortran
  ENV_OMP_DYNAMIC - enables or disables dynamic adjustment of the number of threads within a team. If undefined, dynamic adjustment is disabled by default.
  ENV_KMP_LIBRARY - selects the OpenMP runtime library execution mode. The options for the variable value are serial, turnaround, or throughput.
  ENV_KMP_BLOCKTIME - sets the time, in milliseconds, that a thread waits after completing the execution of a parallel region before sleeping.
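The thread settings above are mutually consistent: 1 socket x 64 cores x 2 threads per core gives 128 hardware threads, which matches OMP_NUM_THREADS=128 and uses half of the 256 logical processors reported by /proc/cpuinfo. The following sketch checks that arithmetic. The parser handles only the simple `<count><S|C|T>` form used in this report, not the full KMP_HW_SUBSET syntax, and the function name is illustrative.

```python
import re


def hw_subset_threads(spec):
    """Multiply the socket (S), core (C) and thread (T) counts of a simple subset string."""
    total = 1
    for part in spec.split(","):
        match = re.fullmatch(r"(\d+)([SCT])", part.strip(), re.IGNORECASE)
        if match is None:
            raise ValueError(f"unsupported KMP_HW_SUBSET item: {part!r}")
        total *= int(match.group(1))
    return total


# Values taken from the General Notes (without the ENV_ config-file prefix).
env = {"KMP_HW_SUBSET": "1S,64C,2T", "OMP_NUM_THREADS": "128"}

selected = hw_subset_threads(env["KMP_HW_SUBSET"])
print(selected, selected == int(env["OMP_NUM_THREADS"]))
```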

Base Compiler Invocation

C benchmarks:

 icc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Base Portability Flags

503.postencil:  -DSPEC_USE_INNER_SIMD 
504.polbm:  -DSPEC_USE_INNER_SIMD 
514.pomriq:  -DSPEC_USE_INNER_SIMD 
550.pmd:  -DSPEC_USE_INNER_SIMD   -80 
551.ppalm:  -DSPEC_USE_INNER_SIMD 
552.pep:  -DSPEC_USE_INNER_SIMD 
553.pclvrleaf:  -DSPEC_USE_INNER_SIMD 
554.pcg:  -DSPEC_USE_INNER_SIMD 
555.pseismic:  -DSPEC_USE_INNER_SIMD 
556.psp:  -DSPEC_USE_INNER_SIMD 
557.pcsp:  -DSPEC_USE_INNER_SIMD 
559.pmniGhost:  -DSPEC_USE_INNER_SIMD   -nofor_main 
560.pilbdc:  -DSPEC_USE_INNER_SIMD 
563.pswim:  -DSPEC_USE_INNER_SIMD 
570.pbt:  -DSPEC_USE_INNER_SIMD 

Base Optimization Flags

C benchmarks:

 -O3   -g   -qopenmp   -xMIC-AVX512   -qopenmp-offload=host 

Fortran benchmarks:

 -O3   -g   -qopenmp   -xMIC-AVX512   -qopenmp-offload=host 

Benchmarks using both Fortran and C:

 -O3   -g   -qopenmp   -xMIC-AVX512   -qopenmp-offload=host 

Peak Compiler Invocation

C benchmarks:

 icc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Peak Portability Flags

503.postencil:  -DSPEC_USE_INNER_SIMD 
504.polbm:  -DSPEC_USE_INNER_SIMD 
514.pomriq:  -DSPEC_USE_INNER_SIMD 
550.pmd:  -DSPEC_USE_INNER_SIMD   -80 
551.ppalm:  -DSPEC_USE_INNER_SIMD   -DSPEC_HOST_FFTW3 
552.pep:  -DSPEC_USE_INNER_SIMD 
553.pclvrleaf:  -DSPEC_USE_INNER_SIMD 
554.pcg:  -DSPEC_USE_INNER_SIMD 
555.pseismic:  -DSPEC_USE_INNER_SIMD 
556.psp:  -DSPEC_USE_INNER_SIMD 
557.pcsp:  -DSPEC_USE_INNER_SIMD 
559.pmniGhost:  -DSPEC_USE_INNER_SIMD   -nofor_main 
560.pilbdc:  -DSPEC_USE_INNER_SIMD 
563.pswim:  -DSPEC_USE_INNER_SIMD 
570.pbt:  -DSPEC_USE_INNER_SIMD 

Peak Optimization Flags

C benchmarks:

503.postencil:  -O3   -xCORE-AVX2   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=3 
504.polbm:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=5 
514.pomriq:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=2 
552.pep:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-streaming-stores always 
554.pcg:  -O3   -xCORE-AVX2   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=2   -qopt-streaming-stores always 
557.pcsp:  Same as 504.polbm 
570.pbt:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host 

Fortran benchmarks:

550.pmd:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=3   -no-prec-div   -fimf-precision=low 
551.ppalm:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -no-prec-sqrt   -I/sw/taurus/libraries/fftw/3.3.6pl1-gcc5.3-intelmpi5.1/include   -L/sw/taurus/libraries/fftw/3.3.6pl1-gcc5.3-intelmpi5.1/lib 
555.pseismic:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host 
556.psp:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=2 
560.pilbdc:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=3 
563.pswim:  Same as 555.pseismic 

Benchmarks using both Fortran and C:

553.pclvrleaf:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-streaming-stores always 
559.pmniGhost:  -O3   -xMIC-AVX512   -g   -qopenmp   -qopenmp-offload=host   -qopt-prefetch=3   -qopt-streaming-stores always 

Peak Other Flags

Fortran benchmarks:

551.ppalm:  -lfftw3 
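551.ppalm shows by far the largest difference between base and peak. The sketch below computes the speedup from the run times in the Results Table, using the median of the three runs on each side. The peak build of this benchmark differs from base in several ways (`-DSPEC_HOST_FFTW3`, `-no-prec-sqrt`, and linking against FFTW), and the report does not attribute the gain to any single one of them. The helper name is illustrative.

```python
from statistics import median

# 551.ppalm run times in seconds, transcribed from the Results Table.
PPALM_BASE_SECONDS = (4660, 4654, 4645)
PPALM_PEAK_SECONDS = (251, 251, 252)


def speedup(base_runs, peak_runs):
    """Ratio of median base time to median peak time (greater than 1 means peak is faster)."""
    return median(base_runs) / median(peak_runs)


print(f"551.ppalm peak vs. base: {speedup(PPALM_BASE_SECONDS, PPALM_PEAK_SECONDS):.1f}x")
```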

The flags file that was used to format this result can be browsed at
https://www.spec.org/accel/flags/icc2015-openmp.html.

You can also download the XML flags source by saving the following link:
https://www.spec.org/accel/flags/icc2015-openmp.xml.