Loop tiling is one of the most important compiler optimization techniques and benefits both parallel machines and uniprocessors. Efficient generation of multi-level tiled code is essential to maximize data reuse in deep memory hierarchies. Tiled loops with parameterized tile sizes (rather than compile-time constants) enable the runtime optimizations used in iterative compilation and automatic tuning. Previous parametric multi-level tiling approaches have been restricted to perfectly nested loops, where all statements are contained inside the innermost loop of a loop nest, while previous solutions for imperfectly nested loops have been limited to fixed tile sizes. PrimeTile provides an effective way to generate efficient parameterized multi-level tiled code for imperfectly nested loops. The generated tiled code contains loops that iterate over full rectangular tiles, making them suitable for compiler optimizations such as register tiling.
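To make the idea of parameterized tile sizes and separated full tiles concrete, here is a minimal Python sketch. It is an illustration only and not PrimeTile output: it assumes a simple rectangular n x m iteration space, and the names tiled_points, ti, and tj are hypothetical. Tile sizes are ordinary runtime arguments; full tiles run with fixed trip counts, while boundary tiles clamp their bounds.

```python
def tiled_points(n, m, ti, tj):
    """Visit an n x m iteration space in ti x tj tiles.

    Returns two lists: points executed inside full rectangular tiles,
    and points executed inside partial (boundary) tiles.
    """
    full, partial = [], []
    for it in range(0, n, ti):            # tile loop over i
        for jt in range(0, m, tj):        # tile loop over j
            if it + ti <= n and jt + tj <= m:
                # Full tile: bounds need no min(), so the trip counts
                # are exactly ti and tj.
                for i in range(it, it + ti):
                    for j in range(jt, jt + tj):
                        full.append((i, j))
            else:
                # Partial tile: clamp the bounds to the iteration space.
                for i in range(it, min(it + ti, n)):
                    for j in range(jt, min(jt + tj, m)):
                        partial.append((i, j))
    return full, partial


# Tile sizes are chosen at run time, not fixed when the code is generated.
full, partial = tiled_points(10, 7, 4, 3)
print(len(full), len(partial))
```

With n=10, m=7, ti=4, and tj=3, four tiles are full (48 points) and the remaining 22 points fall into boundary tiles. Because the full-tile loops have constant trip counts, they are the natural target for further transformations such as register tiling.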
PrimeTile 0.2.0 (beta) (updated: March 25th, 2009)
PrimeTile 0.3.0 (beta -- prerelease) (for latest improvements, including register tiling support)
1. Python (widely available in any Linux/Unix distribution). PrimeTile has been tested successfully with Python 2.5.1 (and any newer versions) on various Linux distributions.
2. GMP (GNU Multiple Precision Arithmetic Library -- a portable library written in C for arbitrary-precision arithmetic on integers, rational numbers, and floating-point numbers)
3. PrimeTile consists of two major components: an adapted version of CLooG and Orio. Both are already included in the PrimeTile package. Nothing needs to be downloaded separately.
To install PrimeTile, we only need to build the CLooG code that is already included in the package.
% tar -xvzf primetile-X.X.X.tar.gz
% cd primetile-X.X.X
% cd cloog
% ./configure --without-polylib --with-isl=bundled --with-gmp-prefix=/path/to/gmp/installation
% make
% ./check
Note that the GMP installation directory must be explicitly specified during configuration. The last step verifies that the adapted version of CLooG has been built properly.
Command Line Options
primetile is the wrapper script around all of PrimeTile's components.
% ./primetile --help
Description: compile script for PrimeTile
Usage: primetile [options] <ifile>
  <ifile>    Input file containing the context, the statement domains,
             and the scattering functions

Options for code generation:
  -t <level> | --tilelevel=<level>      The number of tiling levels (0: no tiling)
                                        (default setting: 1)
  -b <level> | --boundarylevel=<level>  The largest tiling level used for tiling the boundary tiles
                                        (0: no boundary-tile tiling, -1: tilelevel-1)
                                        (default setting: -1)
  -f <depth> | --first=<depth>          The first loop depth to start tiling (-1: infinity)
                                        (default setting: 1)
  -l <depth> | --last=<depth>           The last loop depth to stop tiling (-1: infinity)
                                        (default setting: -1)

General options:
  -o <file> | --output=<file>           Place the output to <file>
  -h | --help                           Display this usage message
  -q | --quiet                          Don't print any details of the running program
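As a usage illustration, the following Python sketch assembles a primetile command line from the options listed in the help text. The helper primetile_command and the file names input.cloog and tiled.c are hypothetical; only the option names and default values come from the usage message.

```python
def primetile_command(ifile, tilelevel=1, boundarylevel=-1, first=1, last=-1,
                      output=None, quiet=False):
    """Build an argument list for the primetile wrapper script.

    Defaults mirror the "default setting" values in the help text.
    """
    cmd = [
        "./primetile",
        "--tilelevel=%d" % tilelevel,
        "--boundarylevel=%d" % boundarylevel,
        "--first=%d" % first,
        "--last=%d" % last,
    ]
    if output is not None:
        cmd.append("--output=%s" % output)
    if quiet:
        cmd.append("--quiet")
    cmd.append(ifile)          # the input file comes after the options
    return cmd


# Two-level tiling, written to a hypothetical output file.
cmd = primetile_command("input.cloog", tilelevel=2, output="tiled.c")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run from the PrimeTile directory. Spelling out every option, even at its default, keeps the chosen tiling configuration visible when many variants are generated during tuning.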
Adapted Version of CLooG
PrimeTile requires an adapted version of CLooG that can generate loop code containing redundant "one-time" loops so that all statements in every loop nest have as many surrounding loops as the dimensionality of the corresponding iteration space. The source code of the adapted CLooG can be downloaded via the following link.
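The effect of a one-time loop can be sketched in Python. This is an illustration under assumptions, not actual CLooG output: it assumes a statement S1 whose two-dimensional iteration space has its second dimension fixed to j == i, and the function names are hypothetical. In the second version, S1 is wrapped in a loop that executes exactly once, so both statements are surrounded by two loops.

```python
def imperfect_nest(n):
    """Imperfectly nested loop: S1 sits outside the inner j loop."""
    trace = []
    for i in range(n):
        trace.append(("S1", i, i))          # S1 runs once per i, at j == i
        for j in range(i + 1, n):
            trace.append(("S2", i, j))
    return trace


def with_one_time_loops(n):
    """Same computation; S1 is wrapped in a loop that runs exactly once."""
    trace = []
    for i in range(n):
        for j in range(i, i + 1):           # redundant one-time loop
            trace.append(("S1", i, j))
        for j in range(i + 1, n):
            trace.append(("S2", i, j))
    return trace


# Both versions execute the same statement instances in the same order.
assert imperfect_nest(4) == with_one_time_loops(4)
```

The one-time loop adds no iterations, but it gives every statement the same loop depth as the dimensionality of its iteration space, which is the property the paragraph above requires.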
Modified CLooG (developmental version 0.14.0-132)
To build the modified version of CLooG:
% tar -xvzf cloog.tar.gz
% cd cloog
% ./configure --without-polylib --with-isl=bundled --with-gmp-prefix=/path/to/gmp/installation
% make
% ./check
The GMP installation directory must be specified during configuration. The last step is to ensure that the installation succeeds.
Important note: The latest version of CLooG now has an -otl command-line option that enables the generation of one-time loops. This CLooG feature did not exist when the first version of PrimeTile was released. Thanks to the CLooG developers (Dr. Cédric Bastoul, Dr. Sven Verdoolaege, and others) for adding this feature.
Download the archive of the original PrimeTile webpage, which contains a more comprehensive explanation of PrimeTile and some usage examples.
1. Dynamic Selection of Tile Sizes. Sanket Tavarageri, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. High Performance Computing Conference (HiPC 2011), Dec 2011, Bangalore, India.
2. Parametric Tiling of Affine Loop Nests. Sanket Tavarageri, Albert Hartono, Muthu Manikandan Baskaran, Louis-Noël Pouchet, J. Ramanujam, and P. Sadayappan. Workshop on Compilers for Parallel Computing (CPC), Jul 2010, Vienna University of Technology, Vienna, Austria.
3. Parameterized Tiling Revisited. Muthu Manikandan Baskaran, Albert Hartono, Sanket Tavarageri, Thomas Henretty, J. Ramanujam, and P. Sadayappan. IEEE/ACM International Symposium on Code Generation and Optimization (CGO), Apr 2010, Toronto, Canada.
4. DynTile: Parametric Tiled Loop Generation for Effective Parallel Execution on Multicore Processors. Albert Hartono, Muthu Manikandan Baskaran, J. Ramanujam, and P. Sadayappan. IEEE International Parallel and Distributed Processing Symposium (IPDPS), Apr 2010, Atlanta, Georgia.
5. Parametric Multi-Level Tiling of Imperfectly Nested Loops. Albert Hartono, Muthu Manikandan Baskaran, Cédric Bastoul, Albert Cohen, Sriram Krishnamoorthy, Boyana Norris, J. Ramanujam, and P. Sadayappan. ACM International Conference on Supercomputing (ICS), June 2009, IBM T.J. Watson Research Center, Yorktown Heights, New York.
This work was supported in part by the National Science Foundation through awards 0403342, 0508245, 0509442, 0509467, 0541409, 0811457, and 0811781, and a State of Ohio Development Fund.
Please feel free to contact Albert Hartono at albert.hartono@( gmail | intel ).com for questions.