Polybench/Fortran

From HPCRL Wiki
(Difference between revisions)
Jump to: navigation, search
(Documentation)
(Documentation)
Line 166: Line 166:
 
</pre>
 
</pre>
  
** To compile a benchmark with execution time reporting:
+
====== To compile a benchmark with execution time reporting ======
--------------------------------------------------------
+
<pre>
 
+
 
#  Build utilities first  
 
#  Build utilities first  
 
$> gcc -c -DPOLYBENCH_TIME utilities/fpolybench.c -o utilities/fpolybench.o
 
$> gcc -c -DPOLYBENCH_TIME utilities/fpolybench.c -o utilities/fpolybench.o
Line 176: Line 175:
 
</pre>
 
</pre>
  
** To generate the reference output of a benchmark:
+
====== To generate the reference output of a benchmark ======
---------------------------------------------------
+
<pre>
 
+
 
#  Build utilities first  
 
#  Build utilities first  
 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o
 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o
Line 185: Line 183:
 
$> gfortran -ffree-line-length-none -O0 -DPOLYBENCH_DUMP_ARRAYS linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_ref
 
$> gfortran -ffree-line-length-none -O0 -DPOLYBENCH_DUMP_ARRAYS linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_ref
 
$> ./atax_ref 2>atax_ref.out
 
$> ./atax_ref 2>atax_ref.out
 +
</pre>
  
  
  
  
-------------------------
+
==== Some available options ====
* Some available options:
+
-------------------------
+
  
 
They are all passed as macro definitions during compilation time (e.g,
 
They are all passed as macro definitions during compilation time (e.g,
 
-Dname_of_the_option).
 
-Dname_of_the_option).
  
- POLYBENCH_TIME: output execution time (gettimeofday) [default: off]
+
* POLYBENCH_TIME: output execution time (gettimeofday) [default: off]
 
+
* POLYBENCH_NO_FLUSH_CACHE: don't flush the cache before calling the
- POLYBENCH_NO_FLUSH_CACHE: don't flush the cache before calling the
+
 
   timer [default: flush the cache]
 
   timer [default: flush the cache]
 
+
* POLYBENCH_LINUX_FIFO_SCHEDULER: use FIFO real-time scheduler for the
- POLYBENCH_LINUX_FIFO_SCHEDULER: use FIFO real-time scheduler for the
+
 
   kernel execution, the program must be run as root, under linux only,
 
   kernel execution, the program must be run as root, under linux only,
 
   and compiled with -lc [default: off]
 
   and compiled with -lc [default: off]
 
+
* POLYBENCH_CACHE_SIZE_KB: cache size to flush, in kB [default: 33MB]
- POLYBENCH_CACHE_SIZE_KB: cache size to flush, in kB [default: 33MB]
+
* POLYBENCH_STACK_ARRAYS: use stack allocation instead of malloc [default: off]
 
+
* POLYBENCH_DUMP_ARRAYS: dump all live-out arrays on stderr [default: off]
- POLYBENCH_STACK_ARRAYS: use stack allocation instead of malloc [default: off]
+
* POLYBENCH_CYCLE_ACCURATE_TIMER: Use Time Stamp Counter to monitor
 
+
- POLYBENCH_DUMP_ARRAYS: dump all live-out arrays on stderr [default: off]
+
 
+
- POLYBENCH_CYCLE_ACCURATE_TIMER: Use Time Stamp Counter to monitor
+
 
   the execution time of the kernel [default: off]
 
   the execution time of the kernel [default: off]
 
+
* POLYBENCH_PAPI: turn on papi timing (see below).
- POLYBENCH_PAPI: turn on papi timing (see below).
+
* MINI_DATASET, SMALL_DATASET, STANDARD_DATASET, LARGE_DATASET,
 
+
- MINI_DATASET, SMALL_DATASET, STANDARD_DATASET, LARGE_DATASET,
+
 
   EXTRALARGE_DATASET: set the dataset size to be used
 
   EXTRALARGE_DATASET: set the dataset size to be used
 
   [default: STANDARD_DATASET]
 
   [default: STANDARD_DATASET]
 
+
* POLYBENCH_USE_SCALAR_LB: Use scalar loop bounds instead of parametric ones.
- POLYBENCH_USE_SCALAR_LB: Use scalar loop bounds instead of parametric ones.
+
  
  

Revision as of 17:42, 29 March 2012

Contents

News


Description

PolyBench is a collection of benchmarks containing static control parts. The purpose is to uniformize the execution and monitoring of kernels, typically used in past and current publications. PolyBench features include:

  • A single file, tunable at compile-time, used for the kernel instrumentation. It performs extra operations such as cache flushing before the kernel execution, and can set real-time scheduling to prevent OS interference.
  • Non-null data initialization, and live-out data dump.
  • Syntactic constructs to prevent any dead code elimination on the kernel.
  • Parametric loop bounds in the kernels, for general-purpose implementation.
  • Clear kernel marking, using !$pragma scop and !$pragma endscop delimiters.


Available benchmarks (PolyBench/Fortran version 1.0)
Benchmark Description
2mm 2 Matrix Multiplications (D=A.B; E=C.D)
3mm 3 Matrix Multiplications (E=A.B; F=C.D; G=E.F)
adi Alternating Direction Implicit solver
atax Matrix Transpose and Vector Multiplication
bicg BiCG Sub Kernel of BiCGStab Linear Solver
cholesky Cholesky Decomposition
correlation Correlation Computation
covariance Covariance Computation
doitgen Multiresolution analysis kernel (MADNESS)
durbin Toeplitz system solver
dynprog Dynamic programming (2D)
fdtd-2d 2-D Finite Different Time Domain Kernel
fdtd-apml FDTD using Anisotropic Perfectly Matched Layer
gauss-filter Gaussian Filter
gemm Matrix-multiply C=alpha.A.B+beta.C
gemver Vector Multiplication and Matrix Addition
gesummv Scalar, Vector and Matrix Multiplication
gramschmidt Gram-Schmidt decomposition
jacobi-1D 1-D Jacobi stencil computation
jacobi-2D 2-D Jacobi stencil computation
lu LU decomposition
ludcmp LU decomposition
mvt Matrix Vector Product and Transpose
reg-detect 2-D Image processing
seidel 2-D Seidel stencil computation
symm Symmetric matrix-multiply
syr2k Symmetric rank-2k operations
syrk Symmetric rank-k operations
trisolv Triangular solver
trmm Triangular matrix-multiply

Download

Documentation

Copyright (c) 2011-2012 the Ohio State University.

Contact


New in 1.0

  • First release of Polybench/Fortran, based on Polybench/C 3.2


Mailing lists


Available benchmarks

  • linear-algebra
    • linear-algebra/solvers:
      • linear-algebra/kernels/2mm/2mm.F90
      • linear-algebra/kernels/3mm/3mm.F90
      • linear-algebra/kernels/atax/atax.F90
      • linear-algebra/kernels/bicg/bicg.F90
      • linear-algebra/kernels/cholesky/cholesky.F90
      • linear-algebra/kernels/doitgen/doitgen.F90
      • linear-algebra/kernels/gemm/gemm.F90
      • linear-algebra/kernels/gemver/gemver.F90
      • linear-algebra/kernels/gesummv/gesummv.F90
      • linear-algebra/kernels/mvt/mvt.F90
      • linear-algebra/kernels/symm/symm.F90
      • linear-algebra/kernels/syr2k/syr2k.F90
      • linear-algebra/kernels/syrk/syrk.F90
      • linear-algebra/kernels/trisolv/trisolv.F90
      • linear-algebra/kernels/trmm/trmm.F90
    • linear-algebra/solvers:
      • linear-algebra/solvers/durbin/durbin.F90
      • linear-algebra/solvers/dynprog/dynprog.F90
      • linear-algebra/solvers/gramschmidt/gramschmidt.F90
      • linear-algebra/solvers/lu/lu.F90
      • linear-algebra/solvers/ludcmp/ludcmp.F90
  • datamining
    • datamining/correlation/correlation.F90
    • datamining/covariance/covariance.F90
  • medley
    • medley/floyd-warshall/floyd-warshall.F90
    • medley/reg_detect/reg_detect.F90
  • stencils
    • stencils/adi/adi.F90
    • stencils/fdtd-2d/fdtd-2d.F90
    • stencils/fdtd-apml/fdtd-apml.F90
    • stencils/jacobi-1d-imper/jacobi-1d-imper.F90
    • stencils/jacobi-2d-imper/jacobi-2d-imper.F90
    • stencils/seidel-2d/seidel-2d.F90



Sample compilation commands

To compile a benchmark without any monitoring
#  Build utilities first 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o

#  Build the benchmark
$> gfortran -ffree-line-length-none linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_base
To compile a benchmark with execution time reporting
#  Build utilities first 
$> gcc -c -DPOLYBENCH_TIME utilities/fpolybench.c -o utilities/fpolybench.o

#  Build the benchmark
$> gfortran -ffree-line-length-none -DPOLYBENCH_TIME linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_time
To generate the reference output of a benchmark
#  Build utilities first 
$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o

#  Build the benchmark
$> gfortran -ffree-line-length-none -O0 -DPOLYBENCH_DUMP_ARRAYS linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_ref
$> ./atax_ref 2>atax_ref.out



Some available options

They are all passed as macro definitions during compilation time (e.g, -Dname_of_the_option).

  • POLYBENCH_TIME: output execution time (gettimeofday) [default: off]
  • POLYBENCH_NO_FLUSH_CACHE: don't flush the cache before calling the
 timer [default: flush the cache]
  • POLYBENCH_LINUX_FIFO_SCHEDULER: use FIFO real-time scheduler for the
 kernel execution, the program must be run as root, under linux only,
 and compiled with -lc [default: off]
  • POLYBENCH_CACHE_SIZE_KB: cache size to flush, in kB [default: 33MB]
  • POLYBENCH_STACK_ARRAYS: use stack allocation instead of malloc [default: off]
  • POLYBENCH_DUMP_ARRAYS: dump all live-out arrays on stderr [default: off]
  • POLYBENCH_CYCLE_ACCURATE_TIMER: Use Time Stamp Counter to monitor
 the execution time of the kernel [default: off]
  • POLYBENCH_PAPI: turn on papi timing (see below).
  • MINI_DATASET, SMALL_DATASET, STANDARD_DATASET, LARGE_DATASET,
 EXTRALARGE_DATASET: set the dataset size to be used
 [default: STANDARD_DATASET]
  • POLYBENCH_USE_SCALAR_LB: Use scalar loop bounds instead of parametric ones.



  • PAPI support:

    • To compile a benchmark with PAPI support:

  1. Build utilities first

$> gcc -c utilities/fpolybench.c -o utilities/fpolybench.o -DPOLYBENCH_PAPI

  1. Build the benchmark

$> gfortran -ffree-line-length-none -O3 -DPOLYBENCH_PAPI linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_papi


    • To specify which counter(s) to monitor:

Edit utilities/papi_counters.list, and add 1 line per event to monitor. Each line (including the last one) must finish with a ',' and both native and standard events are supported.

The whole kernel is run one time per counter (no multiplexing) and there is no sampling being used for the counter value.



  • Accurate performance timing:

With kernels that have an execution time in the orders of a few tens of milliseconds, it is critical to validate any performance number by repeating several times the experiment. A companion script is available to perform reasonable performance measurement of a PolyBench.

  1. Build utilities first

$> gcc -c -O3 utilities/fpolybench.c -o utilities/fpolybench.o -DPOLYBENCH_TIME

  1. Build the benchmark

$> gfortran -ffree-line-length-none -O3 -DPOLYBENCH_TIME linear-algebra/kernels/atax/atax.F90 -Iutilities utilities/fpolybench.o -o atax_time

  1. Run the benchmark using the companion script.

$> ./utilities/time_benchmark.sh ./atax_time

This script will run five times the benchmark (that must be a PolyBench compiled with -DPOLYBENCH_TIME), eliminate the two extremal times, and check that the deviation of the three remaining does not exceed a given thresold, set to 5%.

It is also possible to use POLYBENCH_CYCLE_ACCURATE_TIMER to use the Time Stamp Counter instead of gettimeofday() to monitor the number of elapsed cycles.




  • Generating macro-free benchmark suite:

(from the root of the archive:) $> PARGS="-I utilities -DPOLYBENCH_TIME"; $> for i in `cat utilities/benchmark_list`; do create_pped_version.sh $i "$PARGS"; done

This create for each benchmark file 'xxx.F' a new file 'xxx.preproc.F'. The PARGS variable in the above example can be set to the desired configuration.

e.g $> PARGS="-I utilities -DPOLYBENCH_STACK_ARRAYS -DPOLYBENCH_USE_SCALAR_LB -DLARGE_DATASET -DPOLYBENCH_TIME"; $> for i in `cat utilities/benchmark_list`; do ./utilities/create_pped_version.sh "$i" "$PARGS"; done




  • Acknowledgements:

This software was produced with support from the Department of Energy's Office of Advanced Scientific Computing under grant DE-SC0005033 and by the National Science Foundation under grant CCF-0811781. Nothing in this work should be construed as reflecting the official policy or position of the Department of Energy, the National Science Foundation, the United States government, or the Ohio State University. </pre>

Personal tools