SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

IBM Corporation

IBM Power 575

MPI2007 license: 0005
Test sponsor: IBM Corporation
Tested by: IBM Corporation
Test date: Jun-2008
Hardware Availability: May-2008
Software Availability: May-2008

Results Table

                Base                                                    Peak
Benchmark       Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio   Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
104.milc          128      199   7.85      197   7.96      195   8.02     128      199   7.85      197   7.96      195   8.02
107.leslie3d      128      336   15.5      335   15.6      338   15.5     128      336   15.5      335   15.6      338   15.5
113.GemsFDTD      128      452   13.9      394   16.0      452   14.0     128      452   13.9      394   16.0      452   14.0
115.fds4          128      196   9.94      194   10.0      191   10.2     128      196   9.94      194   10.0      191   10.2
121.pop2          128      493   8.37      556   7.43      493   8.37     128      493   8.37      556   7.43      493   8.37
122.tachyon       128      461   6.07      455   6.15      461   6.07     128      461   6.07      455   6.15      461   6.07
126.lammps        128      227   12.8      230   12.7      232   12.6     128      227   12.8      230   12.7      232   12.6
127.wrf2          128      469   16.6      466   16.7      470   16.6     128      469   16.6      466   16.7      470   16.6
128.GAPgeofem     128      126   16.4      126   16.3      125   16.5     128      126   16.4      126   16.3      125   16.5
129.tera_tf       128      447   6.19      444   6.24      443   6.25     128      447   6.19      444   6.24      443   6.25
130.socorro       128      188   20.3      241   15.9      186   20.5     128      188   20.3      241   15.9      186   20.5
132.zeusmp2       128      318   9.75      319   9.72      322   9.63     128      318   9.75      319   9.72      322   9.63
137.lu            128      243   15.1      243   15.1      304   12.1     128      243   15.1      243   15.1      304   12.1
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
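
The median measurements mentioned in the table note can be recomputed from the three ratios recorded for each benchmark. The sketch below does that and also forms the geometric mean of the medians, which is how SPEC MPI2007 combines per-benchmark ratios into an overall metric. The overall score is not present in this text, so the computed figure is only illustrative; it works from the rounded ratios and takes the median of the ratios rather than of the run times, so it may differ slightly from the published value. The names `BASE_RATIOS`, `median_ratios` and `geometric_mean` are chosen here for illustration.

```python
from math import exp, log
from statistics import median

# Base ratios for the three runs of each benchmark, copied from the Results Table.
BASE_RATIOS = {
    "104.milc": (7.85, 7.96, 8.02),
    "107.leslie3d": (15.5, 15.6, 15.5),
    "113.GemsFDTD": (13.9, 16.0, 14.0),
    "115.fds4": (9.94, 10.0, 10.2),
    "121.pop2": (8.37, 7.43, 8.37),
    "122.tachyon": (6.07, 6.15, 6.07),
    "126.lammps": (12.8, 12.7, 12.6),
    "127.wrf2": (16.6, 16.7, 16.6),
    "128.GAPgeofem": (16.4, 16.3, 16.5),
    "129.tera_tf": (6.19, 6.24, 6.25),
    "130.socorro": (20.3, 15.9, 20.5),
    "132.zeusmp2": (9.75, 9.72, 9.63),
    "137.lu": (15.1, 15.1, 12.1),
}


def median_ratios(ratios):
    """Return the median of the three run ratios for each benchmark."""
    return {name: median(runs) for name, runs in ratios.items()}


def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid a large product."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


if __name__ == "__main__":
    medians = median_ratios(BASE_RATIOS)
    for name, value in medians.items():
        print(f"{name:<14} {value:5.2f}")
    print(f"geometric mean of medians: {geometric_mean(medians.values()):.1f}")
```

Because every benchmark in this result was run with basepeak = yes, the same calculation applies unchanged to the Peak columns.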
Hardware Summary
Type of System: Homogeneous
Compute Nodes: IBM Power 575, IBM Power 575
Interconnects: InfiniBand, Gigabit Ethernet
File Server Node: IBM Power 575
Head Node: IBM Power 575
Total Compute Nodes: 2
Total Chips: 32
Total Cores: 64
Total Threads: 128
Total Memory: 256 GB
Base Ranks Run: 128
Minimum Peak Ranks: 128
Maximum Peak Ranks: 128
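
The totals in this summary follow from the per-node figures given in the node descriptions of this report (16 chips enabled, 2 cores per chip, 2 threads per core and 128 GB of memory on each of the 2 nodes). A minimal sketch of that arithmetic, with constant and function names chosen here for illustration:

```python
# Per-node figures taken from the node descriptions in this report.
NODES = 2
CHIPS_PER_NODE = 16
CORES_PER_CHIP = 2
THREADS_PER_CORE = 2
MEMORY_PER_NODE_GB = 128


def cluster_totals(nodes, chips_per_node, cores_per_chip,
                   threads_per_core, memory_per_node_gb):
    """Multiply the per-node figures up to cluster-wide totals."""
    chips = nodes * chips_per_node
    cores = chips * cores_per_chip
    threads = cores * threads_per_core
    return {
        "chips": chips,
        "cores": cores,
        "threads": threads,
        "memory_gb": nodes * memory_per_node_gb,
    }


if __name__ == "__main__":
    totals = cluster_totals(NODES, CHIPS_PER_NODE, CORES_PER_CHIP,
                            THREADS_PER_CORE, MEMORY_PER_NODE_GB)
    for key, value in totals.items():
        print(f"{key}: {value}")
```

The thread total equals the 128 base ranks, which is consistent with one MPI rank being placed on each hardware thread.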
Software Summary
C Compiler: IBM XL C/C++ Enterprise Edition V9.0, updated with the Oct2007 PTF
C++ Compiler: IBM XL C/C++ Enterprise Edition V9.0, updated with the Oct2007 PTF
Fortran Compiler: IBM XL Fortran Enterprise Edition V11.1, updated with the Oct2007 PTF
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: IBM Parallel Environment for AIX V4.3.2.2
Other MPI Info: --
Pre-processors: --
Other Software: None

Node Description: IBM Power 575

Hardware
Number of nodes: 1
Uses of the node: compute, head, fileserver
Vendor: IBM Corporation
Model: IBM Power 575
CPU Name: POWER6
CPU(s) orderable: 32 cores
Chips enabled: 16
Cores enabled: 32
Cores per chip: 2
Threads per core: 2
CPU Characteristics:
CPU MHz: 4700
Primary Cache: 64 KB I + 64 KB D on chip per core
Secondary Cache: 4 MB I+D on chip per core
L3 Cache: 32 MB I+D off chip per chip
Other Cache: None
Memory: 128 GB (64x2 GB) DDR2 533 MHz
Disk Subsystem: 1x146 GB SFF SAS, 10K RPM
Other Hardware: None
Adapter: Integrated
Number of Adapters: 1
Slot Type: --
Data Rate: 1 Gbps
Ports Used: 1
Interconnect Type: Gigabit Ethernet
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Number of Adapters: 2
Slot Type: GX++
Data Rate: 4x DDR 20 Gbps
Ports Used: 4
Interconnect Type: DDR InfiniBand
Software
Adapter: Integrated
Adapter Driver: fileset devices.chrp.IBM.lhea.rte 5.3.8.2
Adapter Firmware: --
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Adapter Driver: fileset devices.common.IBM.ib.rte 5.3.8.2
Adapter Firmware: --
Operating System: IBM AIX V5.3 with the 5300-08-02 Technology Level
Local File System: AIX/JFS2
Shared File System: NFS over Ethernet
System State: Multi-user
Other Software: APAR IZ26983 (software update for InfiniBand adapter drivers); IBM LoadLeveler for AIX V3.4.3.2

Node Description: IBM Power 575

Hardware
Number of nodes: 1
Uses of the node: compute
Vendor: IBM Corporation
Model: IBM Power 575
CPU Name: POWER6
CPU(s) orderable: 32 cores
Chips enabled: 16
Cores enabled: 32
Cores per chip: 2
Threads per core: 2
CPU Characteristics:
CPU MHz: 4700
Primary Cache: 64 KB I + 64 KB D on chip per core
Secondary Cache: 4 MB I+D on chip per core
L3 Cache: 32 MB I+D off chip per chip
Other Cache: None
Memory: 128 GB (64x2 GB) DDR2 533 MHz
Disk Subsystem: 1x146 GB SFF SAS, 10K RPM
Other Hardware: None
Adapter: Integrated
Number of Adapters: 1
Slot Type: --
Data Rate: 1 Gbps
Ports Used: 1
Interconnect Type: Gigabit Ethernet
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Number of Adapters: 2
Slot Type: GX++
Data Rate: 4x DDR 20 Gbps
Ports Used: 4
Interconnect Type: DDR InfiniBand
Software
Adapter: Integrated
Adapter Driver: fileset devices.chrp.IBM.lhea.rte 5.3.8.2
Adapter Firmware: --
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Adapter Driver: fileset devices.common.IBM.ib.rte 5.3.8.2
Adapter Firmware: --
Operating System: IBM AIX V5.3 with the 5300-08-02 Technology Level
Local File System: AIX/JFS2
Shared File System: NFS over Ethernet
System State: Multi-user
Other Software: APAR IZ26983 (software update for InfiniBand adapter drivers); IBM LoadLeveler for AIX V3.4.3.2

Interconnect Description: InfiniBand

Hardware
Vendor: QLogic
Model: --
Switch Model: QLogic SilverStorm 9024
Number of Switches: 2
Number of Ports: 24
Data Rate: InfiniBand 4x DDR 20 Gbps
Firmware: 4.2.1.1.1
Topology: linear
Primary Use: MPI Communication

Interconnect Description: Gigabit Ethernet

Hardware
Vendor: IBM Corporation
Model: Cisco Systems WS-C6509-E (Catalyst 6500 9-slot Chassis System)
Switch Model: Cisco Systems WS-X6748-GE-TX (CEF720 48 port 10/100/1000mb Ethernet card); Cisco Systems WS-SUP720-3B (2 ports Supervisor Engine 720 Rev. 5.2)
Number of Switches: 1
Number of Ports: 48
Data Rate: 1 Gbps
Firmware: 01ES330_034_034
Topology: --
Primary Use: File system

General Notes

113.GemsFDTD (base): Applied maxprocandstop src.alt
129.tera_tf (base): Applied fixbuffer src.alt
127.wrf2 (base): Applied fixcalling src.alt
All ulimits were set to unlimited.
The "petaskbind.sh" script was used to bind each task to a unique processor.
POE environment variables set before executing benchmarks:
 CWD                 =/specmpi/mpi2007-1.0
 MP_ADAPTER_USE      =shared
 MP_EUILIB           =us
 MP_EUIDEVICE        =sn_all
 MP_SHARED_MEMORY    =yes
 MP_SINGLE_THREAD    =yes
 MP_WAIT_MODE        =poll
 MP_EAGER_LIMIT      =65536
 MP_BUFFER_MEM       =67108864
 MP_POLLING_INTERVAL =80000000
 MP_USE_BULK_XFER    =yes
 MP_BULK_MIN_MSG_SIZE=65536
 MP_STDINMODE        =none
 MP_LABELIO          =no
 MP_HOSTFILE         =$CWD/r35.128-2node
Other environment variables:
 MEMORY_AFFINITY     =MCM
 LDR_CNTRL           =DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K
 XLFRTEOPTS          =intrinthds=1
The submit command uses the petaskbind.sh script to bind ranks to logical processors:
 poe $CWD/petaskbind.sh $command -procs $ranks
The Gigabit Ethernet switch is shared among many nodes, not just the cluster used in this benchmark.
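
The POE settings and the submit command listed in these notes can be sketched as data plus a small helper that assembles the command line. This is only an illustration: the contents of petaskbind.sh are not given in this report, `$command` and `$ranks` are placeholders in the source, and the names `POE_ENV`, `build_poe_invocation` and the sample command `./benchmark_binary` are hypothetical. Nothing is executed; the sketch just prints what would be exported and run. Only the MP_* variables are shown; the other environment variables would be added the same way.

```python
import shlex

# Working directory given in the notes.
CWD = "/specmpi/mpi2007-1.0"

# POE variables copied from the General Notes; $CWD is expanded in MP_HOSTFILE.
POE_ENV = {
    "MP_ADAPTER_USE": "shared",
    "MP_EUILIB": "us",
    "MP_EUIDEVICE": "sn_all",
    "MP_SHARED_MEMORY": "yes",
    "MP_SINGLE_THREAD": "yes",
    "MP_WAIT_MODE": "poll",
    "MP_EAGER_LIMIT": "65536",       # 64 KiB
    "MP_BUFFER_MEM": "67108864",     # 64 MiB
    "MP_POLLING_INTERVAL": "80000000",
    "MP_USE_BULK_XFER": "yes",
    "MP_BULK_MIN_MSG_SIZE": "65536",  # 64 KiB
    "MP_STDINMODE": "none",
    "MP_LABELIO": "no",
    "MP_HOSTFILE": f"{CWD}/r35.128-2node",
}


def build_poe_invocation(command, ranks, cwd=CWD):
    """Return the argument list for: poe $CWD/petaskbind.sh $command -procs $ranks."""
    return ["poe", f"{cwd}/petaskbind.sh", *shlex.split(command),
            "-procs", str(ranks)]


if __name__ == "__main__":
    for name, value in POE_ENV.items():
        print(f"export {name}={shlex.quote(value)}")
    # "./benchmark_binary" is a hypothetical stand-in for $command.
    print(shlex.join(build_poe_invocation("./benchmark_binary", 128)))
```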

Base Compiler Invocation

C benchmarks:

 /usr/bin/mpcc_r 

C++ benchmarks:

126.lammps:  /usr/bin/mpCC_r 

Fortran benchmarks:

 /usr/bin/mpxlf95_r 

Benchmarks using both Fortran and C:

 /usr/bin/mpcc_r   /usr/bin/mpxlf95_r 

Base Portability Flags

107.leslie3d:  -qfixed 
115.fds4:  -DSPEC_MPI_LC_NO_TRAILING_UNDERSCORE   -qfixed 
121.pop2:  -DSPEC_MPI_AIX 
127.wrf2:  -DNOUNDERSCORE   -DSPEC_MPI_AIX 
130.socorro:  -DSPEC_NO_UNDERSCORE   -qcpluscmt 
132.zeusmp2:  -qfixed   -DSPEC_SINGLE_UNDERSCORE 
137.lu:  -qfixed 

Base Optimization Flags

C benchmarks:

 -O4   -qarch=pwr6   -qtune=pwr6   -q64 

C++ benchmarks:

126.lammps:  -O4   -qarch=pwr6   -qtune=pwr6   -qstrict   -q64 

Fortran benchmarks:

 -O4   -qarch=pwr6   -qtune=pwr6   -qalias=nostd   -q64 

Benchmarks using both Fortran and C:

 -O4   -qarch=pwr6   -qtune=pwr6   -qalias=nostd   -q64 

Base Other Flags

C benchmarks:

 -w   -qsuppress=1500-036   -qipa=noobject   -qipa=threads 

C++ benchmarks:

126.lammps:  -w   -qsuppress=1500-036   -qipa=noobject   -qipa=threads 

Fortran benchmarks:

 -w   -qsuppress=1500-036   -qsuppress=cmpmsg   -qipa=noobject   -qipa=threads 

Benchmarks using both Fortran and C:

 -w   -qsuppress=1500-036   -qsuppress=cmpmsg   -qipa=noobject   -qipa=threads 

Peak Optimization Flags

C benchmarks:

104.milc:  basepeak = yes 
122.tachyon:  basepeak = yes 

C++ benchmarks:

126.lammps:  basepeak = yes 

Fortran benchmarks:

107.leslie3d:  basepeak = yes 
113.GemsFDTD:  basepeak = yes 
129.tera_tf:  basepeak = yes 
137.lu:  basepeak = yes 

Benchmarks using both Fortran and C:

115.fds4:  basepeak = yes 
121.pop2:  basepeak = yes 
127.wrf2:  basepeak = yes 
128.GAPgeofem:  basepeak = yes 
130.socorro:  basepeak = yes 
132.zeusmp2:  basepeak = yes 

The flags files that were used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/MPI2007_flags.20080828.html,
http://www.spec.org/mpi2007/flags/MPI2007_flags.0.20080828.html,
http://www.spec.org/mpi2007/flags/MPI2007_flags.1.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/mpi2007/flags/MPI2007_flags.20080828.xml,
http://www.spec.org/mpi2007/flags/MPI2007_flags.0.20080828.xml,
http://www.spec.org/mpi2007/flags/MPI2007_flags.1.xml.