SPEChpc™ 2021 Large Result

IBM (Test Sponsor: Oak Ridge National Laboratory)

Summit: IBM Power System AC922 (IBM Power9, Tesla V100-SXM2-16GB)

SPEChpc 2021_lrg_base = 41.00

SPEChpc 2021_lrg_peak = Not Run

hpc2021 License:	056A	Test Date:	Sep-2021
Test Sponsor:	Oak Ridge National Laboratory	Hardware Availability:	Nov-2018
Tested by:	Oak Ridge National Laboratory	Software Availability:	Jul-2021

Benchmark result graphs are available in the PDF report.

Results Table

Benchmark	Base									Peak
Benchmark	Model	Ranks	Thrds/Rnk	Seconds	Ratio	Seconds	Ratio	Seconds	Ratio	Model	Ranks	Thrds/Rnk	Seconds	Ratio	Seconds	Ratio	Seconds	Ratio
SPEChpc 2021_lrg_base					41.00
SPEChpc 2021_lrg_peak					Not Run
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
805.lbm_l	ACC	8400	1	38.6	70.5	27.0	101
818.tealeaf_l	ACC	8400	1	68.3	21.2	68.3	21.2
819.clvleaf_l	ACC	8400	1	37.4	56.2	35.3	59.5
828.pot3d_l	ACC	8400	1	1560	29.1	1410	32.3
834.hpgmgfv_l	ACC	8400	1	1510	22.2	1400	23.9
835.weather_l	ACC	8400	1	39.3	87.2	37.6	91.0
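
As a cross-check on the table, the overall metric can be approximately reproduced from the per-benchmark ratios. The sketch below assumes the usual SPEC convention that the overall score is the geometric mean of the per-benchmark ratios and that, with two runs per benchmark, the lower ratio is the one that counts. The names `base_ratios` and `geometric_mean` are illustrative and not part of the SPEC tools.

```python
import math

# Lower of the two base ratios for each benchmark, copied from the table above.
base_ratios = {
    "805.lbm_l": 70.5,
    "818.tealeaf_l": 21.2,
    "819.clvleaf_l": 56.2,
    "828.pot3d_l": 29.1,
    "834.hpgmgfv_l": 22.2,
    "835.weather_l": 87.2,
}


def geometric_mean(values):
    """Return the geometric mean of an iterable of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


score = geometric_mean(base_ratios.values())
print(f"Recomputed SPEChpc 2021_lrg_base: {score:.2f}")
```

The recomputed value should land within about 0.1 of the reported 41.00. A small gap is expected because the ratios in the table are rounded to three significant digits.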

Hardware Summary
Type of System:	Homogeneous Cluster
Compute Node:	IBM Power System AC922
Interconnect:	Mellanox InfiniBand
Compute Nodes Used:	1400
Total Chips:	2800
Total Cores:	30800
Total Threads:	123200
Total Memory:	700 TB
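
The totals in this summary can be checked against the per-node values given in the node description further down. The following is a minimal sketch of that arithmetic, assuming 1 TB = 1024 GB and taking the reported total core count as given; all variable names are illustrative.

```python
# Values copied from this report.
nodes = 1400
chips_per_node = 2           # "Chips enabled" in the node description
threads_per_core = 4         # "Threads per core" in the node description
total_cores = 30800          # taken as reported in the summary
memory_per_node_gb = 512     # "Memory" in the node description
base_ranks = 8400            # "Base Ranks Run" in the software summary

# Derived totals.
total_chips = nodes * chips_per_node
total_threads = total_cores * threads_per_core
total_memory_tb = nodes * memory_per_node_gb / 1024  # assumes 1 TB = 1024 GB
ranks_per_node = base_ranks // nodes

print(total_chips, total_threads, total_memory_tb, ranks_per_node)
```

With these inputs the chip, thread, and memory totals match the summary, and the run works out to 6 MPI ranks per node.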

Software Summary
Compiler:	C/C++/Fortran: Version 21.7 of NVHPC Toolkit
MPI Library:	Spectrum MPI Version 10.4.0.3
Other MPI Info:	None
Other Software:	None
Base Parallel Model:	ACC
Base Ranks Run:	8400
Base Threads Run:	1
Peak Parallel Models:	Not Run

Node Description: IBM Power System AC922

Hardware
Number of nodes:	1400
Uses of the node:	compute
Vendor:	IBM
Model:	IBM Power System AC922
CPU Name:	IBM POWER9 2.1 (pvr 004e 1201)
CPU(s) orderable:	2 chips
Chips enabled:	2
Cores enabled:	22
Cores per chip:	44
Threads per core:	4
CPU Characteristics:	Up to 3.8 GHz
CPU MHz:	2300
Primary Cache:	32 KB I + 32 KB D on chip per core
Secondary Cache:	512 KB I+D on chip per core
L3 Cache:	110 MB I+D on chip per chip
Other Cache:	None
Memory:	512 GB (16 x 32 GB RDIMM-DDR4-2666)
Disk Subsystem:	2 x 800 GB (Samsung Electronics Co Ltd NVMe SSD Controller 172Xa/172Xb)
Other Hardware:	None
Accel Count:	6
Accel Model:	Tesla V100-SXM2-16GB
Accel Vendor:	NVIDIA Corporation
Accel Type:	GPU
Accel Connection:	NVLink 2.0
Accel ECC enabled:	Yes
Accel Description:	See Notes
Adapter:	Mellanox ConnectX-5
Number of Adapters:	2
Slot Type:	None
Data Rate:	100 Gb/s (4X EDR)
Ports Used:	2
Interconnect Type:	EDR InfiniBand

Software
Accelerator Driver:	NVIDIA CUDA 450.80.02
Adapter:	Mellanox ConnectX-5
Adapter Driver:	4.9-2.2.4.1
Adapter Firmware:	16.29.1016
Operating System:	Red Hat Enterprise Linux 8.2
Local File System:	xfs
Shared File System:	250 PB IBM Spectrum Scale parallel filesystem over 4X EDR InfiniBand
System State:	Multi-user, run level 3
Other Software:	None

Interconnect Description: Mellanox InfiniBand

Hardware
Vendor:	Mellanox
Model:	Mellanox Switch IB-2
Switch Model:	Mellanox IB EDR Switch IB-2
Number of Switches:	1
Number of Ports:	36
Data Rate:	100 Gb/s
Topology:	Non-blocking Fat-tree
Primary Use:	MPI Traffic and GPFS access

Software

Submit Notes

The config file option 'submit' was used.

General Notes

 MPI startup command:
   The jsrun command was used to launch the job with 1 GPU per rank.
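
The report does not give the actual command line. The sketch below only illustrates how a jsrun invocation consistent with the stated layout (8400 ranks on 1400 nodes, 1 GPU per rank) might be assembled. It assumes one resource set per rank and one CPU core per resource set, and the executable name `./benchmark_exe` is hypothetical.

```python
def jsrun_command(total_ranks, nodes, executable):
    """Build a jsrun argument list with one rank and one GPU per resource set."""
    rs_per_host = total_ranks // nodes  # resource sets placed on each node
    return [
        "jsrun",
        "-n", str(total_ranks),   # total number of resource sets
        "-r", str(rs_per_host),   # resource sets per host
        "-a", "1",                # MPI tasks per resource set
        "-g", "1",                # GPUs per resource set
        "-c", "1",                # CPU cores per resource set (assumption)
        executable,               # hypothetical benchmark binary
    ]


cmd = jsrun_command(total_ranks=8400, nodes=1400, executable="./benchmark_exe")
print(" ".join(cmd))
```

In the actual run the launch line was supplied through the config file's 'submit' option, so the real flags may differ from this sketch.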
Detailed information from nvaccelinfo

CUDA Driver Version:           11000
NVRM version:                  NVIDIA UNIX ppc64le Kernel Module  450.80.02  Wed Sep 23 00:55:04 UTC 2020

Device Number:                 0
Device Name:                   Tesla V100-SXM2-16GB
Device Revision Number:        7.0
Global Memory Size:            16911433728
Number of Multiprocessors:     80
Concurrent Copy and Execution: Yes
Total Constant Memory:         65536
Total Shared Memory per Block: 49152
Registers per Block:           65536
Warp Size:                     32
Maximum Threads per Block:     1024
Maximum Block Dimensions:      1024, 1024, 64
Maximum Grid Dimensions:       2147483647 x 65535 x 65535
Maximum Memory Pitch:          2147483647B
Texture Alignment:             512B
Clock Rate:                    1530 MHz
Execution Timeout:             No
Integrated Device:             No
Can Map Host Memory:           Yes
Compute Mode:                  exclusive-process
Concurrent Kernels:            Yes
ECC Enabled:                   Yes
Memory Clock Rate:             877 MHz
Memory Bus Width:              4096 bits
L2 Cache Size:                 6291456 bytes
Max Threads Per SMP:           2048
Async Engines:                 4
Unified Addressing:            Yes
Managed Memory:                Yes
Concurrent Managed Memory:     Yes
Preemption Supported:          Yes
Cooperative Launch:            Yes
  Multi-Device:                Yes
Default Target:                cc70
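
A few derived figures help put the nvaccelinfo output in context. The sketch below converts the reported values into more familiar units. It assumes that the HBM2 memory transfers data twice per memory clock cycle, so the bandwidth figure is a theoretical peak rather than a measured value; variable names are illustrative.

```python
# Values copied from the nvaccelinfo output above.
global_memory_bytes = 16911433728
multiprocessors = 80
max_threads_per_smp = 2048
memory_clock_mhz = 877
memory_bus_width_bits = 4096

# Device memory expressed in GiB.
memory_gib = global_memory_bytes / 2**30

# Upper bound on threads resident on the device at one time.
max_resident_threads = multiprocessors * max_threads_per_smp

# Theoretical peak memory bandwidth in GB/s, assuming two transfers per clock.
bytes_per_transfer = memory_bus_width_bits / 8
peak_bandwidth_gbs = memory_clock_mhz * 1e6 * 2 * bytes_per_transfer / 1e9

print(f"{memory_gib:.2f} GiB, {max_resident_threads} threads, "
      f"{peak_bandwidth_gbs:.1f} GB/s")
```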

Compiler Version Notes

==============================================================================
 CC  805.lbm_l(base) 818.tealeaf_l(base) 834.hpgmgfv_l(base)
------------------------------------------------------------------------------
/usr/lib64/crt1.o:(.rodata+0x8): undefined reference to `main'
/usr/bin/ld: link errors found, deleting executable `a.out'
pgacclnk: child process exit status 1: /sw/summit/xalt/1.2.1/bin/ld
nvc 21.7-0 linuxpower target on Linuxpower 
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
------------------------------------------------------------------------------

==============================================================================
 FC  819.clvleaf_l(base) 828.pot3d_l(base) 835.weather_l(base)
------------------------------------------------------------------------------
nvfortran 21.7-0 linuxpower target on Linuxpower 
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
------------------------------------------------------------------------------

Base Compiler Invocation

C benchmarks:

mpicc

Fortran benchmarks:

mpif90

Base Optimization Flags

C benchmarks:

-O3 -acc=gpu

Fortran benchmarks:

-O3 -acc=gpu

The flags file that was used to format this result can be browsed at
http://www.spec.org/hpc2021/flags/nv2021_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/hpc2021/flags/nv2021_flags.xml.