SPEC® MPIL2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

Hewlett Packard Enterprise

SGI 8600
(Intel Xeon Gold 6148, 2.40 GHz)

MPI2007 license: 1
Test sponsor: HPE
Tested by: HPE
Test date: Oct-2017
Hardware Availability: Jul-2017
Software Availability: Nov-2017

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark       Base                                                    Peak
               Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio   Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
121.pop2        4352     22.3    175     20.9    186     21.1    185    4096     21.0    186     20.5    190     20.3    192
122.tachyon     4352     28.2   69.0     27.9   69.6     28.2   69.0    4352     28.2   69.0     27.9   69.6     28.2   69.0
125.RAxML       4352     47.2   61.9     47.0   62.1     47.3   61.7    5120     42.1   69.3     42.0   69.5     42.3   68.9
126.lammps      4352     16.6    148     16.3    151     16.0    154    5120     14.7    168     15.1    163     14.8    166
128.GAPgeofem   4352     48.9    121     48.9    121     48.6    122    4352     48.9    121     48.9    121     48.6    122
129.tera_tf     4352     20.5   53.7     20.4   54.0     20.2   54.5    4096     20.4   53.9     20.0   54.9     19.9   55.2
132.zeusmp2     4352     23.4   90.5     23.8   89.0     23.7   89.4    2048     20.4    104     20.8    102     20.7    102
137.lu          4352     19.0    221     19.2    219     19.1    220    2048     15.7    268     15.6    270     15.4    273
142.dmilc       4352     13.0    283     13.2    280     13.0    283    4352     13.0    283     13.2    280     13.0    283
143.dleslie     4352     12.6    247     12.6    246     12.6    247    4864     11.5    269     11.8    262     11.6    267
145.lGemsFDTD   4352     37.0    119     37.2    119     36.9    119    2048     30.3    146     30.3    146     30.4    145
147.l2wrf2      4352     33.3    246     33.7    244     33.2    247    5120     31.8    258     31.8    258     31.9    257
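
Each benchmark is run three times, and the reported figure is the median run. The following Python sketch shows one way to pick that median from the (seconds, ratio) pairs, using three base rows copied from the table. The helper names `median_run` and `implied_reference_seconds` are illustrative and not part of any SPEC tool. The implied reference time relies on the SPEC definition of a ratio as reference time divided by run time, and because the table values are rounded, the recomputed reference times are only approximate.

```python
# Base runs copied from the results table: (seconds, ratio) per iteration.
base_runs = {
    "121.pop2": [(22.3, 175), (20.9, 186), (21.1, 185)],
    "137.lu": [(19.0, 221), (19.2, 219), (19.1, 220)],
    "147.l2wrf2": [(33.3, 246), (33.7, 244), (33.2, 247)],
}


def median_run(runs):
    """Return the (seconds, ratio) pair with the middle run time of three runs."""
    return sorted(runs)[len(runs) // 2]


def implied_reference_seconds(seconds, ratio):
    """Ratio = reference time / run time, so reference is roughly ratio * seconds."""
    return ratio * seconds


for name, runs in base_runs.items():
    seconds, ratio = median_run(runs)
    ref = implied_reference_seconds(seconds, ratio)
    print(f"{name}: median {seconds} s, ratio {ratio}, implied reference ~{ref:.0f} s")
```

Sorting by run time and taking the middle element is enough for three runs; a larger odd number of runs would work the same way.
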
Hardware Summary
Type of System: Homogeneous
Compute Node: HPE XA730i Gen10 Server Node
Interconnect: InfiniBand (MPI and I/O)
File Server Node: Lustre FS
Total Compute Nodes: 128
Total Chips: 256
Total Cores: 5120
Total Threads: 10240
Total Memory: 24 TB
Base Ranks Run: 4352
Minimum Peak Ranks: 2048
Maximum Peak Ranks: 5120
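
The totals above follow from the per-node figures given in the compute node description (2 chips, 20 cores per chip, 2 threads per core, 192 GB per node). A small Python sketch of that arithmetic is shown below. The even spread of base ranks over all 128 nodes is an assumption made for illustration only; the report does not state the per-node rank count.

```python
# Per-node figures from the compute node description; totals from the Hardware Summary.
nodes = 128
chips_per_node = 2
cores_per_chip = 20
threads_per_core = 2
memory_per_node_gb = 192
base_ranks = 4352

total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core
total_memory_tb = nodes * memory_per_node_gb / 1024  # assumes 1 TB = 1024 GB

# Assumption (not stated in the report): base ranks are spread evenly over all nodes.
ranks_per_node = base_ranks / nodes
core_fraction = base_ranks / total_cores

print(total_chips, total_cores, total_threads, total_memory_tb)
print(ranks_per_node, core_fraction)
```

Under that assumption, the base runs would place 34 ranks on each 40-core node, using 85% of the available cores.
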
Software Summary
C Compiler: Intel C Composer XE for Linux,
Version 18.0.0.128 Build 20170811
C++ Compiler: Intel C++ Composer XE for Linux,
Version 18.0.0.128 Build 20170811
Fortran Compiler: Intel Fortran Composer XE for Linux,
Version 18.0.0.128 Build 20170811
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: HPE Performance Software - Message Passing
Interface 2.17
Other MPI Info: OFED 3.2.2
Pre-processors: None
Other Software: None

Node Description: HPE XA730i Gen10 Server Node

Hardware
Number of nodes: 128
Uses of the node: compute
Vendor: Hewlett Packard Enterprise
Model: SGI 8600 (Intel Xeon Gold 6148, 2.40 GHz)
CPU Name: Intel Xeon Gold 6148
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 40
Cores per chip: 20
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.70 GHz
CPU MHz: 2400
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 1 MB I+D on chip per core
L3 Cache: 27.5 MB I+D on chip per chip
Other Cache: None
Memory: 192 GB (12 x 16 GB 2Rx4 PC4-2666V-R)
Disk Subsystem: None
Other Hardware: None
Adapter: Mellanox MT27700 with ConnectX-4 ASIC
Number of Adapters: 2
Slot Type: PCIe x16 Gen3 8GT/s
Data Rate: InfiniBand 4X EDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27700 with ConnectX-4 ASIC
Adapter Driver: OFED-3.4-2.1.8.0
Adapter Firmware: 12.18.1000
Operating System: Red Hat Enterprise Linux Server 7.3 (Maipo),
Kernel 3.10.0-514.2.2.el7.x86_64
Local File System: LFS
Shared File System: LFS
System State: Multi-user, run level 3
Other Software: SGI Management Center Compute Node 3.5.0,
Build 716r171.rhel73-1705051353

Node Description: Lustre FS

Hardware
Number of nodes: 4
Uses of the node: fileserver
Vendor: Hewlett Packard Enterprise
Model: Rackable C1104-GP2 (Intel Xeon E5-2690 v3, 2.60 GHz)
CPU Name: Intel Xeon E5-2690 v3
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 24
Cores per chip: 12
Threads per core: 1
CPU Characteristics: Intel Turbo Boost Technology up to 3.50 GHz
Hyper-Threading Technology disabled
CPU MHz: 2600
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 30 MB I+D on chip per chip
Other Cache: None
Memory: 128 GB (8 x 16 GB 2Rx4 PC4-2133P-R)
Disk Subsystem: 684 TB RAID 6
48 x 8+2 2TB 7200 RPM
Other Hardware: None
Adapter: Mellanox MT27700 with ConnectX-4 ASIC
Number of Adapters: 2
Slot Type: PCIe x16 Gen3
Data Rate: InfiniBand 4X EDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27700 with ConnectX-4 ASIC
Adapter Driver: OFED-3.3-1.0.0.0
Adapter Firmware: 12.14.2036
Operating System: Red Hat Enterprise Linux Server 7.3 (Maipo),
Kernel 3.10.0-514.2.2.el7.x86_64
Local File System: ext3
Shared File System: LFS
System State: Multi-user, run level 3
Other Software: None

Interconnect Description: InfiniBand (MPI and I/O)

Hardware
Vendor: Mellanox Technologies and SGI
Model: SGI P0002145
Switch Model: SGI P0002145
Number of Switches: 30
Number of Ports: 36
Data Rate: InfiniBand 4X EDR
Firmware: 11.0350.0394
Topology: Enhanced Hypercube
Primary Use: MPI and I/O traffic

Base Tuning Notes

src.alt used: 143.dleslie->integer_overflow

Submit Notes

The config file option 'submit' was used.

General Notes



 Software environment:
   export MPI_REQUEST_MAX=65536
   export MPI_TYPE_MAX=32768
   export MPI_IB_RAILS=2
   export MPI_IB_IMM_UPGRADE=false
   export MPI_CONNECTIONS_THRESHOLD=0
   export MPI_IB_DCIS=2
   export MPI_IB_HYPER_LAZY=false
   ulimit -s unlimited
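
 The exported variables above could also be applied from a Python launcher script. The sketch below copies the settings verbatim and builds an environment dictionary; the names `MPI_SETTINGS`, `build_env`, and `as_export_lines` are hypothetical helpers, not part of the MPI library or the SPEC tools.

```python
import os

# Settings copied verbatim from the "Software environment" notes.
MPI_SETTINGS = {
    "MPI_REQUEST_MAX": "65536",
    "MPI_TYPE_MAX": "32768",
    "MPI_IB_RAILS": "2",
    "MPI_IB_IMM_UPGRADE": "false",
    "MPI_CONNECTIONS_THRESHOLD": "0",
    "MPI_IB_DCIS": "2",
    "MPI_IB_HYPER_LAZY": "false",
}


def build_env(base=None):
    """Return a copy of the base environment with the MPI settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(MPI_SETTINGS)
    return env


def as_export_lines(settings):
    """Render the settings as shell export lines, in insertion order."""
    return [f"export {key}={value}" for key, value in settings.items()]


env = build_env({})
print("\n".join(as_export_lines(MPI_SETTINGS)))
```

 The resulting dictionary could be passed as `env=` to `subprocess.run`. The `ulimit -s unlimited` line is a shell resource limit rather than an environment variable, so it is not covered by this sketch.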

 BIOS settings:
   AMI BIOS version SAED7177, 07/17/2017

 Job Placement:
   Each MPI job was assigned to a topologically compact set
   of nodes.

 Additional notes regarding interconnect:
   The InfiniBand network consists of two independent planes,
   with half the switches in the system allocated to each plane.
   I/O traffic is restricted to one plane, while MPI traffic can
   use both planes.
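
 The two-plane layout can be turned into rough per-plane numbers. The Python sketch below uses the switch counts from the interconnect description and the nominal 100 Gb/s rate of a 4X EDR port. It assumes that each of the two adapters in a node connects to a different plane through one port; the report does not spell out that wiring, so the per-node bandwidth figures are illustrative upper bounds rather than measured values.

```python
# Figures from the interconnect description.
switches = 30
ports_per_switch = 36
planes = 2
adapters_per_node = 2

# Nominal data rate of one InfiniBand 4X EDR port, in Gb/s.
edr_4x_gbps = 100

switches_per_plane = switches // planes
ports_per_plane = switches_per_plane * ports_per_switch

# Assumption: one adapter port per plane per node.
nominal_mpi_gbps_per_node = adapters_per_node * edr_4x_gbps  # MPI may use both planes
nominal_io_gbps_per_node = 1 * edr_4x_gbps  # I/O is restricted to one plane

print(switches_per_plane, ports_per_plane)
print(nominal_mpi_gbps_per_node, nominal_io_gbps_per_node)
```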

Base Compiler Invocation

C benchmarks:

 icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Base Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 

Base Optimization Flags

C benchmarks:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX512   -no-prec-div   -ansi-alias   -ipo 

Fortran benchmarks:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 

Benchmarks using both Fortran and C:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 

Base Other Flags

C benchmarks:

 -lmpi 

C++ benchmarks:

126.lammps:  -lmpi 

Fortran benchmarks:

 -lmpi 

Benchmarks using both Fortran and C:

 -lmpi 
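
The base invocation, portability, optimization, and other flags listed above combine into a single compiler command line per language. The Python sketch below assembles such a command as an argument list. The function `base_command`, the source file name, and the output name are hypothetical; the actual SPEC harness builds benchmarks through its own tools, and the flag ordering here is only one plausible arrangement.

```python
# Compilers and flags copied from the base sections of this report.
COMPILERS = {"c": "icc", "cxx": "icpc", "fortran": "ifort"}
COMMON_OPT = ["-O3", "-xCORE-AVX512", "-no-prec-div", "-ipo"]
BASE_OPT = {
    "c": COMMON_OPT,
    # 126.lammps (the only C++ benchmark) adds -ansi-alias before -ipo.
    "cxx": ["-O3", "-xCORE-AVX512", "-no-prec-div", "-ansi-alias", "-ipo"],
    "fortran": COMMON_OPT,
}
OTHER_FLAGS = ["-lmpi"]


def base_command(lang, sources, output, portability=()):
    """Assemble an illustrative base compile-and-link command as an argv list."""
    return [
        COMPILERS[lang],
        *BASE_OPT[lang],
        *portability,
        *sources,
        "-o",
        output,
        *OTHER_FLAGS,
    ]


# Hypothetical file and output names, for illustration only.
print(" ".join(base_command("cxx", ["main.cpp"], "lammps_sketch")))
print(" ".join(base_command("fortran", ["pop.f90"], "pop2_sketch",
                            portability=["-DSPEC_MPI_CASE_FLAG"])))
```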

Peak Compiler Invocation

C benchmarks (except as noted below):

 icc 
125.RAxML:  /sw/sdev/intel/parallel_studio_xe_2017_update4/compilers_and_libraries_2017.4.196/linux/bin/intel64/icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks (except as noted below):

 ifort 
143.dleslie:  /sw/sdev/intel/parallel_studio_xe_2017_update4/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Peak Portability Flags

Same as Base Portability Flags

Peak Optimization Flags

C benchmarks:

122.tachyon:  basepeak = yes 
125.RAxML:  -O3   -xCORE-AVX512   -no-prec-div   -ipo 
142.dmilc:  basepeak = yes 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX512   -no-prec-div   -ansi-alias   -ipo 

Fortran benchmarks:

 -O3   -xCORE-AVX512   -no-prec-div   -ipo 

Benchmarks using both Fortran and C:

121.pop2:  -O3   -xCORE-AVX512   -no-prec-div   -ipo 
128.GAPgeofem:  basepeak = yes 
132.zeusmp2:  Same as 121.pop2 
147.l2wrf2:  Same as 121.pop2 
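
A setting of `basepeak = yes` means the base result is reported for peak as well, so the base and peak columns of those benchmarks should match exactly in the results table. The Python sketch below checks this for the three benchmarks marked that way, with 137.lu included as a contrasting case; the dictionary layout is an illustrative choice, and the numbers are copied from the table.

```python
# (ranks, [(seconds, ratio), ...]) copied from the results table.
base = {
    "122.tachyon": (4352, [(28.2, 69.0), (27.9, 69.6), (28.2, 69.0)]),
    "128.GAPgeofem": (4352, [(48.9, 121), (48.9, 121), (48.6, 122)]),
    "142.dmilc": (4352, [(13.0, 283), (13.2, 280), (13.0, 283)]),
    "137.lu": (4352, [(19.0, 221), (19.2, 219), (19.1, 220)]),
}
peak = {
    "122.tachyon": (4352, [(28.2, 69.0), (27.9, 69.6), (28.2, 69.0)]),
    "128.GAPgeofem": (4352, [(48.9, 121), (48.9, 121), (48.6, 122)]),
    "142.dmilc": (4352, [(13.0, 283), (13.2, 280), (13.0, 283)]),
    "137.lu": (2048, [(15.7, 268), (15.6, 270), (15.4, 273)]),
}

# Benchmarks flagged "basepeak = yes" in the Peak Optimization Flags section.
basepeak = {"122.tachyon", "128.GAPgeofem", "142.dmilc"}

identical = {name for name in base if base[name] == peak[name]}
print(sorted(identical))
print(identical == basepeak)
```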

Peak Other Flags

Same as Base Other Flags


The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/HPE_x86_64_Intel18_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/HPE_x86_64_Intel18_flags.xml.