# 352.ep SPEC ACCEL Benchmark Description File

352.ep

## Benchmark Author

Serial C version was developed the Center for Manycore Programming at Seoul National University and derived from the serial Fortran versions in "NPB3.3-SER" developed by NAS.

OpenACC version was developed by Rengan Xu and Sunita Chandrasekaran from University of Houston and Mathew Colgrove from NVIDIA.

## Benchmark Program General Category

Embarrassingly Parallel

## Benchmark Description

EP kernel benchmark is an embarrassingly parallel algorithm with a reduction. The algorithm generates n pairs of uniform (0,1) pseudorandom deviates (xj,yj). Then for each j the condition tj = x2j + yj2 <= 1 is checked. If the condition is satisfied, Xk = xj sqrt(-2log(tj))/tj and Yk = yj sqrt(-2log(tj))/tj , where k starts from 1 and increments after each step. Finally Ql (0 <= l <= 9) counts the pairs (Xk,Yk) that lie in the square annulus l <= max(|Xk, Yk|) <= l + 1. Then Sum(Xk) + Sum(Yk) are then calculated. In this algorithm, Ql(0 <= l <= 9) performs the reduction of all the pairs.

## Input Description

The input dataset size is comprised of W, A through E classes. We have used the 3 classes in our experiments:

Class W: reference data for n = 2^25 pairs of (xj,yj) (1 <= j <= n)

Class C: reference data for n = 2^32 pairs of (xj,yj) (1 <= j <= n)

Class D: references data for n = 2^36 pairs of (xj,yj) (1 <= j <= n)

Class W is used by the test workload, Class C by train, and Class D by ref.

## Output Description

Ql (0 <= l <= 9) that counts the pairs (Xk,Yk) that lie in the square annulus l <= max(|Xk, Yk|) <= l + 1, and Sum(Xk) + Sum(Yk).

C