The Unbalanced Tree Search (UTS) benchmark is a parallel benchmark that reports the performance achieved when performing an exhaustive search of an unbalanced tree. The tree is generated on the fly using a splittable random number generator (RNG), which allows the random stream to be split and processed in parallel while still producing a deterministic tree. The splittable RNG is built on the SHA-1 secure hash algorithm. Thus, generating a node's children requires multiple applications of SHA-1 to produce a splittable hash for each child.
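The idea of a SHA-1-based splittable RNG can be illustrated with a minimal Python sketch. It is not the code shipped with UTS: the byte encoding of the seed and child index, the helper names (`root_state`, `child_state`, `to_unit_interval`, `count_nodes`), and the branching rule and its parameters (`b0`, `m`, `q`) are all assumptions chosen for illustration.

```python
import hashlib
import struct

def root_state(seed: int) -> bytes:
    """Derive the root's 20-byte state from an integer seed (assumed encoding)."""
    return hashlib.sha1(struct.pack(">i", seed)).digest()

def child_state(parent: bytes, index: int) -> bytes:
    """Split the stream: hash the parent's state together with the child index."""
    return hashlib.sha1(parent + struct.pack(">i", index)).digest()

def to_unit_interval(state: bytes) -> float:
    """Map the last 4 bytes of a state to a float in [0, 1)."""
    return struct.unpack(">I", state[-4:])[0] / 2**32

def count_nodes(seed: int, b0: int = 4, m: int = 2, q: float = 0.45,
                limit: int = 1_000_000) -> int:
    """Exhaustively search a tree defined only by the seed.

    Hypothetical branching rule: the root has b0 children; every other node
    has m children with probability q and none otherwise. The search stops
    early once `limit` nodes have been counted.
    """
    root = root_state(seed)
    stack = [child_state(root, i) for i in range(b0)]
    count = 1  # the root itself
    while stack and count < limit:
        state = stack.pop()
        count += 1
        if to_unit_interval(state) < q:
            stack.extend(child_state(state, i) for i in range(m))
    return count

if __name__ == "__main__":
    for seed in range(5):
        print(seed, count_nodes(seed))
```

Because each child's state depends only on its parent's state and its own index, any worker holding a node can expand the subtree below it without coordinating with the others, and the same seed always yields the same tree.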
The Ohio State University
- Professor P. Sadayappan
- James Dinan
- Gerald Sabin
University of Maryland
- Professor Chau-Wen Tseng
University of North Carolina
- Professor Jan Prins
- Stephen Olivier
The UTS benchmark presents significant load imbalance problems to any parallelization scheme. The size and shape of the tree are random, and inferring these parameters from any node in the tree would require a shortcut around the SHA-1 secure hash algorithm, that is, a way to predict its outputs without computing them. Thus, any static work distribution scheme is highly unlikely to yield good performance for UTS.
We have investigated a number of load balancing schemes in the context of UTS. Most schemes fall into two categories: work stealing and work sharing.
- Work Stealing
- Work stealing is a distributed load balancing scheme in which idle processors poll their peers in the computation for work. Depending on the programming model, stealing may be done in a one-sided manner (e.g. in UPC, OpenMP, or Shmem) or in a two-sided manner (e.g. in MPI). When a thief finds a victim with work, it steals a portion of the victim's work and is able to continue processing.
- Work Sharing
- Work sharing uses a centralized mechanism for distributing work. It can be accomplished through the use of work servers or shared work queues. In these schemes, workers send new tasks (in the case of UTS, these are tree nodes) to the server or enqueue them in the shared queue. When a worker exhausts its work, it can get more from the server or the shared queue.
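The work stealing scheme described above can be sketched as a small single-process simulation in Python. This is an illustration under stated assumptions rather than any of the UTS implementations: workers are simulated in round-robin order instead of running in parallel, an idle worker takes half of the first non-trivial queue it finds, and the node rule (`m`, `q`), the seed, and the function names are hypothetical.

```python
import hashlib
from collections import deque

def children(state: bytes, m: int = 2, q: float = 0.45) -> list:
    """Hypothetical node rule: m children with probability q, otherwise none."""
    if int.from_bytes(state[-4:], "big") / 2**32 < q:
        return [hashlib.sha1(state + bytes([i])).digest() for i in range(m)]
    return []

def simulate_work_stealing(seed: bytes = b"uts", workers: int = 4, b0: int = 16):
    """Return (nodes processed per worker, number of successful steals)."""
    root = hashlib.sha1(seed).digest()
    queues = [deque() for _ in range(workers)]
    # All of the root's children start on worker 0, so the others must steal.
    queues[0].extend(hashlib.sha1(root + bytes([i])).digest() for i in range(b0))
    processed = [0] * workers
    processed[0] = 1  # worker 0 expanded the root
    steals = 0
    while any(queues):
        for w in range(workers):
            if not queues[w]:
                # Idle worker: poll peers in turn until one has work to spare.
                for offset in range(1, workers):
                    victim = queues[(w + offset) % workers]
                    if len(victim) > 1:
                        # Steal half of the victim's oldest nodes.
                        for _ in range(len(victim) // 2):
                            queues[w].append(victim.popleft())
                        steals += 1
                        break
            if queues[w]:
                # Process the newest local node (depth-first order).
                node = queues[w].pop()
                processed[w] += 1
                queues[w].extend(children(node))
    return processed, steals

if __name__ == "__main__":
    per_worker, steals = simulate_work_stealing()
    print("nodes per worker:", per_worker, "steals:", steals)
```

Since the tree is fully determined by the seed, the total number of nodes processed is the same for any number of workers; only the split between workers and the number of steals change. A work sharing variant would replace the per-worker deques with a single shared queue or a server that all workers enqueue to and dequeue from.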
Implementations of UTS
Our distribution of UTS includes implementations for:
- Cray MTA/XMT
Implementations in additional parallel models include:
- Chapel (Note: This is a reference implementation and has not been tuned for performance/scalability.)
Scalable Dynamic Load Balancing Using UPC. Stephen Olivier, Jan Prins. Proc. of 37th International Conference on Parallel Processing (ICPP-08). Portland, OR, September 2008.
A Message Passing Benchmark for Unbalanced Applications. James Dinan, Stephen Olivier, Gerald Sabin, Jan Prins, P. Sadayappan, Chau-Wen Tseng. J. Simulation Modelling Practice and Theory (SIMPAT). Volume 16, Issue 9, Pages 1177-1189. October 2008.
Dynamic Load Balancing of Unbalanced Computations Using Message Passing. James Dinan, Stephen Olivier, Jan Prins, Gerald Sabin, P. Sadayappan, Chau-Wen Tseng. Proc. of 6th Intl. Workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems (PMEO-PDS 2007). Long Beach, CA, March 26-30, 2007.
UTS: An Unbalanced Tree Search Benchmark. Stephen Olivier, Jun Huan, Jinze Liu, Jan Prins, James Dinan, P. Sadayappan, Chau-Wen Tseng. Proc. 19th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC). 2006.
UPC Implementation of an Unbalanced Tree Search Benchmark. Jan Prins, Jun Huan, Bill Pugh, Chau-Wen Tseng, P. Sadayappan. UNC Dept. of Computer Science Technical Report 03-034, October 2003.
UTS is currently maintained by James Dinan. Please contact James with any questions via email at dinan-at-mcs.anl.gov.