The 2005 IEEE International Conference on Cluster Computing
 
Burlington Marriott, Burlington, MA, USA

 

Tutorial Title

Parallel I/O for Scientific Applications

Level

 

This tutorial is intended for an audience interested in learning how to best use I/O resources on parallel computers.  We assume some proficiency in MPI, but no experience with parallel file systems or any of the I/O interfaces or libraries that we will cover.

 

The content is approximately 30% beginner, 40% intermediate, and 30% advanced.

 

Duration

 

A half-day.

 

Presenters

 

Robert Latham

robl@mcs.anl.gov

 

Robert B. Ross

rross@mcs.anl.gov

 

Abstract

 

For all their hype, parallel file systems alone are not sufficient for achieving high-performance and usable I/O on large clusters.  In addition to the performance that a parallel file system can afford, functionality is also needed for mapping high-level or domain-specific application abstractions into file system ones and for managing the many processes that make up these applications.

 

Because of these additional demands, high-performance I/O (HPIO) systems in parallel machines have evolved into multi-tiered solutions.  At the top are high-level I/O libraries such as HDF5 and PnetCDF that provide application developers with familiar interfaces and abstractions such as multidimensional typed variables.  Below this layer sits MPI-IO, providing management of groups of processes and aggregation of operations performed by these groups.  The parallel file system sits at the lowest software layer and provides a unified view of a large number of I/O resources and manages concurrent access to these resources.
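
As a rough illustration of what "aggregation of operations" at the middleware layer can mean, the following toy sketch merges many small, interleaved writes from a group of processes into one large contiguous write.  It is a simplified model under stated assumptions, not the behavior of any particular MPI-IO implementation: no MPI is used, the "ranks" are plain list entries, the "file" is an in-memory byte array, and the names coalesce and collective_write are hypothetical.

```python
# Toy model of request aggregation in an I/O middleware layer.
# Assumptions: no MPI, ranks are list entries, the file is a bytearray.

def coalesce(requests):
    """Merge (offset, data) writes into the fewest contiguous writes."""
    merged = []
    for offset, data in sorted(requests, key=lambda r: r[0]):
        if merged and merged[-1][0] + len(merged[-1][1]) == offset:
            # This request starts exactly where the previous one ends.
            prev_offset, prev_data = merged[-1]
            merged[-1] = (prev_offset, prev_data + data)
        else:
            merged.append((offset, data))
    return merged


def collective_write(file_buf, per_rank_requests):
    """Gather every rank's requests, coalesce them, and apply them.

    Returns the number of write operations actually issued.
    """
    all_requests = [req for reqs in per_rank_requests for req in reqs]
    writes = coalesce(all_requests)
    for offset, data in writes:
        file_buf[offset:offset + len(data)] = data
    return len(writes)


# Four hypothetical ranks each own every fourth 2-byte block of a 16-byte file.
nranks, block, nblocks = 4, 2, 8
per_rank = [
    [(b * block, bytes([65 + rank]) * block)
     for b in range(rank, nblocks, nranks)]
    for rank in range(nranks)
]

independent_ops = sum(len(reqs) for reqs in per_rank)  # one write per piece
file_buf = bytearray(nblocks * block)
collective_ops = collective_write(file_buf, per_rank)

print(independent_ops, collective_ops, file_buf.decode())
# prints: 8 1 AABBCCDDAABBCCDD
```

In this model, eight small independent writes become a single 16-byte write once the group's requests are considered together, which is the kind of opportunity the middleware layer can exploit because it knows which processes are cooperating.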

 

This talk will focus on the application programmer's point of view.  We will discuss how the layers of this software stack lead to an effective solution, what happens to I/O requests as they move through this software stack, the tradeoffs of interfacing at the various layers, and general guidelines for obtaining high performance.  Attendees will come out of the tutorial with an understanding of how HPIO systems are constructed and how they are different from other storage systems, knowledge of what happens in an HPIO system after an application makes an I/O call, a familiarity with the tools available to them for making use of these resources, and some guidelines for extracting performance from these complicated systems.

 


Tutorial Detailed Description

 

Introduction and I/O stacks
·         Application I/O vs. parallel I/O 
·         Bridging the gap with I/O stacks 
·         I/O stacks for computational science 
o  High Level Libraries
o  I/O Middleware
o  Parallel File System
 
I/O interfaces and formats, with examples 
·         POSIX file system interface 
o  Examples
o  POSIX and PFS interaction
·         MPI-IO interface 
o  Collective I/O
o  Noncontiguous I/O
o  Nonblocking and Asynchronous I/O
o  Examples
o  Optimizations
o  MPI-IO Implementations
 
Break
 
·         Parallel netCDF (PnetCDF) 
o  File Layout
o  Usage
o  Examples
o  Interaction with MPI-IO
·         Hierarchical Data Format (HDF5) 
o  File Layout
o  Usage
o  Examples
o  Interaction with MPI-IO
 
I/O best practices 
·         Choosing an I/O interface 
·         Guidelines for I/O performance 
·         Tuning I/O stacks with hints 
o  MPI-IO hints
o  File system-specific hints
o  High-level library hints
·         Enlisting the experts 
 
Conclusions and supplemental material

 


 


Presenter Bios

 

Robert Latham

robl@mcs.anl.gov

 

Research Interests

 

Robert's focus has been on high-performance I/O for scientific applications and I/O metrics.  He has worked on the ROMIO MPI-IO implementation, the parallel file systems PVFS and PVFS2, and leads integration and testing for the Parallel netCDF high-level I/O library.

 

Education

 

  • Master of Science, Computer Engineering, Lehigh University, 2000
  • Bachelor of Science, Computer Engineering, Lehigh University, 1999

 

Experience

 

  • 2002-Present: Software Developer, Argonne National Laboratory
  • 2000-2002: Systems Engineer, Paralogic, Inc.

 

Selected Publications

 

  • Robert Ross, Robert Latham, William Gropp, Rajeev Thakur, and Brian Toonen, ``Implementing MPI-IO Atomic Mode Without File System Support,'' in Proceedings of the 5th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2005), May 2005.
  • Rob Latham, Rob Ross, and Rajeev Thakur, ``The Impact of File Systems on MPI-IO Scalability,'' in Proceedings of the 11th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 87--96.
  • Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher, Michael Zingale, ``Parallel netCDF: A High-Performance Scientific I/O Interface,'' Proceedings of SC2003, Phoenix, AZ, November, 2003.

 

 

Selected Presentations

 

  • R. Ross, W. Ligon, R. Latham, and N. Miller, ``PVFS2 Birds of a Feather Session,'' SC2004, Pittsburgh, PA, November 2004.
  • R. Latham and N. Miller, ``PVFS2 Birds of a Feather Session,'' FAST 2004, San Francisco, CA, 2004.
  • R. Ross, W. Ligon, P. Carns, R. Latham, and N. Miller, ``PVFS Birds of a Feather Session,'' SC2003, Phoenix, AZ, November 2003.

Robert B. Ross

rross@mcs.anl.gov

 

Research Interests

 

Robert's research interests are in message passing and storage systems for high-performance computing environments, in particular cluster computing environments.  He is the lead architect for the Second Parallel Virtual File System (PVFS2), a parallel file system for large-scale parallel computers.  Current projects include the ROMIO MPI-IO implementation, the Parallel netCDF high-level I/O library, the PVFS2 parallel file system, and the MPICH2 MPI-2 implementation.

 

Education

 

  • Doctor of Philosophy, Computer Engineering, Clemson University, December 2000
  • Bachelor of Science, Computer Engineering, Clemson University, May 1994

 

Selected Publications

 

  • Robert Ross, Robert Latham, William Gropp, Rajeev Thakur, and Brian Toonen, ``Implementing MPI-IO Atomic Mode Without File System Support,'' in Proceedings of the 5th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2005), May 2005.
  • Rob Latham, Rob Ross, and Rajeev Thakur, ``The Impact of File Systems on MPI-IO Scalability,'' in Proceedings of the 11th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 87--96.

 

Selected Tutorials

 

  • W. Gropp, E. Lusk, R. Ross, and R. Thakur, ``Advanced MPI: I/O and One-Sided Communication,'' SC2004, Pittsburgh, PA, November, 2004.
  • R. Ross, ``High-Performance I/O for Scientific Applications,'' ClusterWorld 2004, San Jose, CA, April, 2004.
  • R. Ross and R. Thakur, ``Using MPI-2: A Tutorial on Advanced Features of the Message-Passing Interface Standard,'' CCGrid 2004, Chicago, April 2004.
  • W. Gropp, E. Lusk, R. Ross, and R. Thakur, ``Using MPI-2: A Tutorial on Advanced Features of the Message-Passing Interface Standard,'' SC2003, Phoenix, AZ, November, 2003.
  • W. Gropp, E. Lusk, R. Ross, and R. Thakur, ``Using MPI-2: A Tutorial on Advanced Features of the Message-Passing Interface Standard,'' SC2002, Baltimore, MD, November, 2002.