TCE Publications

From HPCRL Wiki
Revision as of 05:18, 14 December 2007 by 75.187.42.246 (Talk)

2007

  • Efficient Search-Space Pruning for Integrated Fusion and Tiling Transformations. X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, and P. Sadayappan. Concurrency and Computation: Practice and Experience, 2007 (Submitted).

2006

  • Hypergraph Partitioning for Automatic Memory Hierarchy Management. S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '06), November 2006.
  • Design and Implementation of a One-Sided Communication Interface for the IBM eServer Blue Gene Supercomputer. M. Blocksome, C. Archer, T. Inglett, P. McCarthy, M. Mundy, J. Ratterman, A. Sidelnik, B. Smith, G. Almasi, J. Castanos, D. Lieber, J. Moreira, S. Krishnamoorthy, and V. Tipparaju. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '06), November 2006.
  • Combining Analytical and Empirical Approaches in Tuning Matrix Transposition. Q. Lu, S. Krishnamoorthy, P. Sadayappan. Proceedings of the 15th International Conference on Parallel Architecture and Compilation Techniques (PACT '06), pp. 233-242, September 2006.
  • An Integrated Approach for Processor Allocation and Scheduling of Mixed-Parallel Applications. N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz. Proceedings of the 35th International Conference on Parallel Processing (ICPP '06), August 2006.
  • Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations. A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M. Nooijen, G. Baumgartner, V. Choppella, D. E. Bernholdt, R. M. Pitzer, J. Ramanujam, A. Rountev, and P. Sadayappan. Proceedings of the 6th International Conference on Computational Science (ICCS 2006), May 2006.
  • Efficient Synthesis of Out-of-Core Algorithms Using a Nonlinear Optimization Solver. S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan, V. Choppella. Journal of Parallel and Distributed Computing, vol. 66, no. 5, pp. 659-673, May 2006.
  • Layout Transformation Support for the Disk Resident Arrays Framework. S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, and P. Sadayappan. Journal of Supercomputing, vol. 36, no. 2, pp. 153-170, May 2006.
  • An Approach to Locality-Conscious Load Balancing and Transparent Memory Hierarchy Management with a Global-Address-Space Parallel Programming Model. S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan. Proceedings of the IPDPS Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL '06), April 2006.
  • An Extensible Global Address Space Framework with Decoupled Task and Data Abstractions. S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan. Proceedings of the IPDPS Workshop on Next Generation Software (NGS '06), April 2006.
  • Automatic Code Generation for Many-Body Electronic Structure Methods: The Tensor Contraction Engine. A. Auer, G. Baumgartner, D.E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R.J. Harrison, A. Hartono, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan, A. Sibiryakov. Molecular Physics, vol. 104, no. 2, pp. 211-228, January 2006.
  • Search-Based Performance-Model Driven Optimization for Compilation of Tensor Contraction Expressions. X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G. Baumgartner, J. Ramanujam, and P. Sadayappan. The 12th Workshop on Compilers for Parallel Computers (CPC '06). Coruna, Spain, January 2006.

2005

  • Data and Computation Abstractions for Dynamic and Irregular Computations. S. Krishnamoorthy, J. Nieplocha, P. Sadayappan. Proceedings of the 12th Annual International Conference on High Performance Computing (HiPC '05), December 2005.
  • Integrated Loop Optimizations for Data Locality Enhancement of Tensor Contraction Expressions. S. K. Sahoo, S. Krishnamoorthy, R. Panuganti, P. Sadayappan. Proceedings of Supercomputing (SC '05), November 2005.
  • Efficient Search-Space Pruning for Integrated Fusion and Tiling Transformations. X. Gao, S. Krishnamoorthy, S. K. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, P. Sadayappan. Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC '05), October 2005.
  • Performance Modeling and Optimization of Parallel Out-of-Core Tensor Contractions. X. Gao, S.K. Sahoo, Q. Lu, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan. Proceedings of the ACM SIGPLAN 2005 Symposium on Principles and Practice of Parallel Programming, Chicago, Illinois, June 2005.
  • Automated Operation Minimization of Tensor Contraction Expressions in Electronic Structure Calculations. A. Hartono, A. Sibiryakov, M. Nooijen, G. Baumgartner, D.E. Bernholdt, S. Hirata, C. Lam, R.M. Pitzer, J. Ramanujam, P. Sadayappan. Proceedings of the International Conference on Computational Science 2005 (ICCS 2005), Atlanta, Georgia, 22-25 May 2005.
  • Locality-aware Load Balancing for Dynamic and Irregular Computations. S. Krishnamoorthy, P. Sadayappan, J. Nieplocha, and M. Krishnan. Workshop on Patterns in High Performance Computing, May 2005.
  • Cache Miss Characterization and Data Locality Optimization for Imperfectly Nested Loops on Shared Memory Multiprocessors. S. K. Sahoo, R. Panuganti, S. Krishnamoorthy, P. Sadayappan. Proceedings of the 19th IEEE International Parallel & Distributed Processing Symposium (IPDPS '05), April 2005.
  • Synthesis of High-Performance Parallel Programs for a Class of Ab Initio Quantum Chemistry Models. G. Baumgartner, A. Auer, D.E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R.J. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan, A. Sibiryakov. Proceedings of the IEEE, vol. 93, no. 2, February 2005, pp. 276-292.

2004

  • Efficient Layout Transformation Support for Disk-based Multidimensional Arrays. S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, P. Sadayappan. Proceedings of the 11th Annual International Conference on High-Performance Computing (HiPC '04), Bangalore, India, 19-22 December 2004. In Lecture Notes in Computer Science, Vol. 3296, Springer-Verlag, pp. 386-398.
  • Layout Transformation Support for the Disk Resident Arrays Framework. S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, P. Sadayappan. Proceedings of the Los Alamos Computer Science Initiative Symposium (LACSI '04), Santa Fe, New Mexico, 12-14 October 2004.
  • Empirical Performance-Model Driven Data Layout Optimization. Q. Lu, X. Gao, S. Krishnamoorthy, G. Baumgartner, J. Ramanujam, P. Sadayappan. Proceedings of Languages and Compilers for Parallel Computing (LCPC), West Lafayette, Indiana, 22-25 September 2004.
  • A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry: The Tensor Contraction Engine. G. Baumgartner, D.E. Bernholdt, V. Choppella, J. Ramanujam, P. Sadayappan. Proceedings of the 11th Workshop on Compilers for Parallel Computers (CPC 2004), Chiemsee, Germany, 7-9 July 2004, pp. 281-290.
  • Efficient Synthesis of Out-of-Core Algorithms Using a Nonlinear Optimization Solver. S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan, V. Choppella. Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS '04), Santa Fe, New Mexico, 26-30 April 2004, Abstract p. 34b, 10 pages. Best paper award.
  • Efficient Parallel Out-of-Core Matrix Transposition. S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan. International Journal on High Performance Computing and Networking, vol. 2, no. 2/3/4, pp. 110-119, 2004. A previous version of this paper appeared in the Proceedings of the IEEE International Conference on Cluster Computing (Cluster '03), Hong Kong, China, 1-4 December 2003, IEEE Computer Society Press, pp. 300-307.

2003

  • Data Locality Optimization for Synthesis of Efficient Out-of-Core Algorithms. S. Krishnan, S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan, J. Ramanujam, D.E. Bernholdt, V. Choppella. Proceedings of the International Conference on High-Performance Computing (HiPC '03), Hyderabad, India, 17-20 December 2003. In Lecture Notes in Computer Science, Vol. 2913, Springer-Verlag, pp. 406-417. Best paper award.
  • Efficient Parallel Out-of-Core Matrix Transposition. S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan. Proceedings of the IEEE International Conference on Cluster Computing (Cluster '03), Hong Kong, China, 1-4 December 2003, IEEE Computer Society Press, pp. 300-307. An extended version of this paper will appear in International Journal on High Performance Computing and Networking, 2004.
  • Memory-Constrained Data Locality Optimization for Tensor Contractions. A. Bibireata, S. Krishnan, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan, J. Ramanujam, D.E. Bernholdt, V. Choppella. In L. Rauchwerger (ed.), Proceedings of the 16th International Workshop on Languages and Compilers for Parallel Computing (LCPC '03), College Station, Texas, 2-4 October 2003, Springer-Verlag, Lecture Notes in Computer Science, Vol. 2958, 2004, pp. 93-108.
  • On Efficient Out-of-Core Matrix Transposition. S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan. Technical Report OSU-CISRC-9/03-TR52, Dept. of Computer and Information Science, The Ohio State University, September 2003.
  • Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints. D. Cociorva, X. Gao, S. Krishnan, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam. Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS '03), Nice, France, 22-26 April 2003, Abstract p. 37b, 8 pages.
  • Compile-Time Optimizations for Tensor Contraction Expressions. G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan, J. Ramanujam. Proceedings of Compilers for Parallel Computers (CPC '03), Amsterdam, The Netherlands, 8-10 January, 2003.

2002

  • A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry. G. Baumgartner, D.E. Bernholdt, D. Cociorva, R.J. Harrison, S. Hirata, C. Lam, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan. Proceedings of Supercomputing 2002, Baltimore, Maryland, 16-22 November 2002. IEEE Computer Society Press, Abstract p. 5, 10 pages.
  • Memory-Constrained Communication Minimization for a Class of Array Computations. D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam. Proceedings of the 15th International Workshop on Languages and Compilers for Parallel Computing (LCPC '02), College Park, Maryland, 25-27 July 2002.
  • Automatic Synthesis of High-Performance Codes for Quantum Chemistry Applications. G. Baumgartner, D.E. Bernholdt, D. Cociorva, R.J. Harrison, C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan. Proceedings of the Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL '02), New York, New York, 22 June 2002.
  • Memory-Optimal Evaluation of Expression Trees Involving Large Objects. C. Lam, T. Rauber, G. Baumgartner, D. Cociorva, P. Sadayappan. ACM Transactions on Programming Languages and Systems (TOPLAS), 2002. Currently under revision.
  • Space-Time Trade-Off Optimization for a Class of Electronic Structure Calculations. D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam, M. Nooijen, D.E. Bernholdt, R.J. Harrison. Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI '02), Berlin, Germany, 17-19 June 2002, pp. 177-186.
  • A Performance Optimization Framework for Compilation of Tensor Contraction Expressions into Parallel Programs. G. Baumgartner, D.E. Bernholdt, D. Cociorva, R.J. Harrison, C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan. 7th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS '02), Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS '02), Fort Lauderdale, Florida, 15 April 2002, IEEE Computer Society, pp. 106-114.

2001

  • Towards Automatic Synthesis of High-Performance Codes for Electronic Structure Calculations: Data Locality Optimization. D. Cociorva, J. Wilkins, G. Baumgartner, P. Sadayappan, J. Ramanujam, M. Nooijen, D.E. Bernholdt, R.J. Harrison. Proceedings of the International Conference on High-Performance Computing (HiPC '01), Hyderabad, India, 17-21 December 2001, Springer-Verlag, Lecture Notes in Computer Science, Vol. 2228, pp. 237-248.
  • Loop Optimizations for a Class of Memory-Constrained Computations. D. Cociorva, J. Wilkins, C. Lam, G. Baumgartner, P. Sadayappan, J. Ramanujam. Proceedings of the 15th ACM International Conference on Supercomputing (ICS '01), Sorrento, Italy, 16-21 June 2001, pp. 103-113.

1999

  • Memory-Optimal Evaluation of Expression Trees Involving Large Objects. C. Lam, D. Cociorva, G. Baumgartner, P. Sadayappan. Proceedings of the 1999 International Conference on High Performance Computing (HiPC '99), Calcutta, India, 17-20 December 1999, IEEE Computer Society, Springer-Verlag, Lecture Notes in Computer Science, Vol. 1745, pp. 103-110.
  • Optimization of Memory Usage and Communication Requirements for a Class of Loops Implementing Multi-Dimensional Integrals. C. Lam, D. Cociorva, G. Baumgartner, P. Sadayappan. Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing (LCPC '99), La Jolla, California, 4-6 August 1999, Springer-Verlag, Lecture Notes in Computer Science, Vol. 1863, pp. 350-364.