Polybench/Fortran

From HPCRL Wiki
Revision as of 17:44, 30 August 2012 by Rountev (Talk | contribs)


News

Description

PolyBench is a collection of benchmarks containing static control parts. Its purpose is to standardize the execution and monitoring of kernels that are typically used in past and current publications. PolyBench features include:

  • A single file, tunable at compile-time, used for the kernel instrumentation. It performs extra operations such as cache flushing before the kernel execution, and can set real-time scheduling to prevent OS interference.
  • Non-null data initialization, and live-out data dump.
  • Syntactic constructs to prevent any dead code elimination on the kernel.
  • Parametric loop bounds in the kernels, for general-purpose implementation.
  • Clear kernel marking, using !$pragma scop and !$pragma endscop delimiters.


Available benchmarks (PolyBench/Fortran version 1.0)
Benchmark      Description
2mm            2 Matrix Multiplications (D=A.B; E=C.D)
3mm            3 Matrix Multiplications (E=A.B; F=C.D; G=E.F)
adi            Alternating Direction Implicit solver
atax           Matrix Transpose and Vector Multiplication
bicg           BiCG Sub Kernel of BiCGStab Linear Solver
cholesky       Cholesky Decomposition
correlation    Correlation Computation
covariance     Covariance Computation
doitgen        Multiresolution analysis kernel (MADNESS)
durbin         Toeplitz system solver
dynprog        Dynamic programming (2D)
fdtd-2d        2-D Finite Difference Time Domain Kernel
fdtd-apml      FDTD using Anisotropic Perfectly Matched Layer
gauss-filter   Gaussian Filter
gemm           Matrix-multiply C=alpha.A.B+beta.C
gemver         Vector Multiplication and Matrix Addition
gesummv        Scalar, Vector and Matrix Multiplication
gramschmidt    Gram-Schmidt decomposition
jacobi-1D      1-D Jacobi stencil computation
jacobi-2D      2-D Jacobi stencil computation
lu             LU decomposition
ludcmp         LU decomposition
mvt            Matrix Vector Product and Transpose
reg-detect     2-D Image processing
seidel         2-D Seidel stencil computation
symm           Symmetric matrix-multiply
syr2k          Symmetric rank-2k operations
syrk           Symmetric rank-k operations
trisolv        Triangular solver
trmm           Triangular matrix-multiply

Download

Documentation

Copyright (c) 2011-2012 the Ohio State University.

Contact


New in 1.0

  • First release of Polybench/Fortran, based on Polybench/C 3.2


Mailing lists


Available benchmarks

  • linear-algebra
    • linear-algebra/kernels:
      • linear-algebra/kernels/2mm/2mm.F90
      • linear-algebra/kernels/3mm/3mm.F90
      • linear-algebra/kernels/atax/atax.F90
      • linear-algebra/kernels/bicg/bicg.F90
      • linear-algebra/kernels/cholesky/cholesky.F90
      • linear-algebra/kernels/doitgen/doitgen.F90
      • linear-algebra/kernels/gemm/gemm.F90
      • linear-algebra/kernels/gemver/gemver.F90
      • linear-algebra/kernels/gesummv/gesummv.F90
      • linear-algebra/kernels/mvt/mvt.F90
      • linear-algebra/kernels/symm/symm.F90
      • linear-algebra/kernels/syr2k/syr2k.F90
      • linear-algebra/kernels/syrk/syrk.F90
      • linear-algebra/kernels/trisolv/trisolv.F90
      • linear-algebra/kernels/trmm/trmm.F90
    • linear-algebra/solvers:
      • linear-algebra/solvers/durbin/durbin.F90
      • linear-algebra/solvers/dynprog/dynprog.F90
      • linear-algebra/solvers/gramschmidt/gramschmidt.F90
      • linear-algebra/solvers/lu/lu.F90
      • linear-algebra/solvers/ludcmp/ludcmp.F90
  • datamining
    • datamining/correlation/correlation.F90
    • datamining/covariance/covariance.F90
  • medley
    • medley/floyd-warshall/floyd-warshall.F90
    • medley/reg_detect/reg_detect.F90
  • stencils
    • stencils/adi/adi.F90
    • stencils/fdtd-2d/fdtd-2d.F90
    • stencils/fdtd-apml/fdtd-apml.F90
    • stencils/jacobi-1d-imper/jacobi-1d-imper.F90
    • stencils/jacobi-2d-imper/jacobi-2d-imper.F90
    • stencils/seidel-2d/seidel-2d.F90


Sample compilation commands


To compile a benchmark without any monitoring
#  Build utilities first 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o

#  Build the benchmark
$> gfortran -ffree-line-length-none linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_base

To compile a benchmark with execution time reporting
#  Build utilities first 
$> gcc -c -DPOLYBENCH_TIME utilities/fpolybench.c -o utilities/fpolybench.o

#  Build the benchmark
$> gfortran -ffree-line-length-none -DPOLYBENCH_TIME linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_time

To generate the reference output of a benchmark
#  Build utilities first 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o

#  Build the benchmark
$> gfortran -ffree-line-length-none -O0 -DPOLYBENCH_DUMP_ARRAYS linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_ref
$> ./atax_ref 2>atax_ref.out
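
The reference output is typically compared against the dump of another build of the same benchmark. The following is a minimal sketch of such a check, not part of PolyBench/Fortran: the helper name same_dump and the binary name atax_opt are hypothetical, and the sketch assumes the second binary was also compiled with -DPOLYBENCH_DUMP_ARRAYS.

```shell
#!/bin/sh
# same_dump REF OUT
# Hypothetical helper (not shipped with PolyBench/Fortran): prints MATCH and
# succeeds if the two dump files are byte-identical, otherwise prints
# MISMATCH and returns 1.
same_dump() {
    if cmp -s "$1" "$2"; then
        echo "MATCH"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Example use, assuming atax_ref.out was produced as above and ./atax_opt
# (hypothetical name) is another build with -DPOLYBENCH_DUMP_ARRAYS:
#   ./atax_opt 2>atax_opt.out
#   same_dump atax_ref.out atax_opt.out
```

A byte-wise comparison is strict: an optimized build may reorder floating-point operations and differ in the last printed digits, in which case a tolerance-based comparison would be more appropriate.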


Some available options


They are all passed as macro definitions at compile time (e.g., -Dname_of_the_option).

POLYBENCH_TIME

Output execution time (gettimeofday) [default: off]

POLYBENCH_NO_FLUSH_CACHE

Don't flush the cache before calling the timer [default: flush the cache]

POLYBENCH_LINUX_FIFO_SCHEDULER

Use the FIFO real-time scheduler for the kernel execution. Linux only; the program must be run as root and compiled with -lc [default: off]

POLYBENCH_CACHE_SIZE_KB

Cache size to flush, in kB [default: 33MB]

POLYBENCH_STACK_ARRAYS

Use stack allocation instead of malloc [default: off]

POLYBENCH_DUMP_ARRAYS

Dump all live-out arrays on stderr [default: off]

POLYBENCH_CYCLE_ACCURATE_TIMER

Use Time Stamp Counter to monitor the execution time of the kernel [default: off]

POLYBENCH_PAPI

Turn on PAPI timing (see below).

MINI_DATASET, SMALL_DATASET, STANDARD_DATASET, LARGE_DATASET, EXTRALARGE_DATASET

Set the dataset size to be used [default: STANDARD_DATASET]

POLYBENCH_USE_SCALAR_LB

Use scalar loop bounds instead of parametric ones.
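
Several options are usually combined in one build. As a minimal sketch, the hypothetical helper polybench_defs below (not part of PolyBench/Fortran) turns a list of option names into -D flags and prints the resulting compile commands instead of running them. Passing the same definitions to both the gcc and the gfortran step is an assumption made for simplicity, following the sample commands above.

```shell
#!/bin/sh
# polybench_defs NAME...
# Hypothetical helper: prints the option names as space-separated -D flags.
polybench_defs() {
    flags=""
    for opt in "$@"; do
        # Add a separating space only when flags is already non-empty.
        flags="$flags${flags:+ }-D$opt"
    done
    printf '%s\n' "$flags"
}

# Example: time the large dataset without flushing the cache.
DEFS=$(polybench_defs POLYBENCH_TIME LARGE_DATASET POLYBENCH_NO_FLUSH_CACHE)

# Print (rather than execute) the two build steps.
echo "gcc -c $DEFS utilities/fpolybench.c -o utilities/fpolybench.o"
echo "gfortran -ffree-line-length-none -O3 $DEFS linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_time"
```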

PAPI support


To compile a benchmark with PAPI support
#  Build utilities first 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o -DPOLYBENCH_PAPI

#  Build the benchmark
$> gfortran -ffree-line-length-none -O3 -DPOLYBENCH_PAPI linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_papi

To specify which counter(s) to monitor

Edit utilities/papi_counters.list and add one line per event to monitor. Each line (including the last one) must end with a ','. Both native and standard events are supported.

The whole kernel is run once per counter (no multiplexing), and the counter values are not sampled.
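
Because a missing trailing ',' is easy to overlook, the list can be checked before building. This is a minimal sketch under two assumptions: blank lines are ignored, and whitespace after the comma is tolerated. The helper name check_counter_list is hypothetical and not part of PolyBench/Fortran.

```shell
#!/bin/sh
# check_counter_list FILE
# Hypothetical helper: reports every non-blank line that does not end
# with ',' and returns 1 if any such line exists.
check_counter_list() {
    # First grep drops blank lines; second grep keeps lines lacking a final ','.
    # "|| true" keeps the assignment from failing when nothing is selected.
    bad=$(grep -v '^[[:space:]]*$' "$1" | grep -v ',[[:space:]]*$' || true)
    if [ -n "$bad" ]; then
        printf 'missing trailing comma:\n%s\n' "$bad"
        return 1
    fi
    echo "ok"
}

# Example use:
#   check_counter_list utilities/papi_counters.list
```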


Accurate performance timing


For kernels whose execution time is on the order of a few tens of milliseconds, it is critical to validate any performance number by repeating the experiment several times. A companion script is available to perform reasonable performance measurements of a PolyBench benchmark.

#  Build utilities first 
$> gcc -c -O3 utilities/fpolybench.c -o utilities/fpolybench.o -DPOLYBENCH_TIME

#  Build the benchmark
$> gfortran -ffree-line-length-none -O3 -DPOLYBENCH_TIME linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_time

#  Run the benchmark using the companion script.
$> ./utilities/time_benchmark.sh ./atax_time

This script runs the benchmark five times (the benchmark must be a PolyBench program compiled with -DPOLYBENCH_TIME), discards the two extreme times, and checks that the deviation of the three remaining times does not exceed a given threshold, set to 5%.
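
The filtering step can be sketched as follows. This is not the code of time_benchmark.sh: the helper name stable_mean is hypothetical, and "deviation" is assumed here to mean the spread (largest minus smallest) of the three retained times relative to their mean.

```shell
#!/bin/sh
# stable_mean T1 T2 T3 T4 T5
# Hypothetical helper: sorts five times, drops the smallest and largest,
# prints the mean of the remaining three, and returns 1 if their spread
# exceeds 5% of that mean (assumed definition of "deviation").
stable_mean() {
    printf '%s\n' "$@" | sort -n | sed -n '2,4p' | awk '
        { t[NR] = $1; sum += $1 }
        END {
            mean = sum / 3
            dev = (t[3] - t[1]) / mean * 100
            printf "%.6f\n", mean
            exit ((dev > 5.0) ? 1 : 0)
        }'
}

# Example use, assuming ./atax_time prints only its execution time on stdout:
#   stable_mean $(for i in 1 2 3 4 5; do ./atax_time; done)
```

With the inputs 1.00 1.01 1.02 0.5 2.0, the retained times are 1.00, 1.01 and 1.02, their spread is about 2% of the mean, and the helper succeeds.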

Alternatively, define POLYBENCH_CYCLE_ACCURATE_TIMER to monitor the number of elapsed cycles with the Time Stamp Counter instead of timing with gettimeofday().


Generating macro-free benchmark suite


(from the root of the archive:)
$> PARGS="-I utilities -DPOLYBENCH_TIME";
$> for i in `cat utilities/benchmark_list`; do ./utilities/create_pped_version.sh "$i" "$PARGS"; done

This creates, for each benchmark file 'xxx.F', a new file 'xxx.preproc.F'. The PARGS variable in the above example can be set to the desired configuration.

For example:

$> PARGS="-I utilities -DPOLYBENCH_STACK_ARRAYS -DPOLYBENCH_USE_SCALAR_LB -DLARGE_DATASET -DPOLYBENCH_TIME";
$> for i in `cat utilities/benchmark_list`; do ./utilities/create_pped_version.sh "$i" "$PARGS"; done

Acknowledgements


This software was produced with support from the Department of Energy's Office of Advanced Scientific Computing under grant DE-SC0005033 and by the National Science Foundation under grant CCF-0811781. Nothing in this work should be construed as reflecting the official policy or position of the US Department of Energy, the US National Science Foundation, the United States government, or the Ohio State University.
