 

Revision as of 11:30, 9 November 2015

With big datasets, disk I/O on large parallel systems can become a significant bottleneck relative to the rest of your code's workflow. Parallel filesystems are optimized to support large, efficient I/O performed simultaneously by multiple users on multiple nodes; however, contrary to popular belief, they do not provide "supercomputing" disk performance. In this introductory webinar I will cover the basics of parallel filesystems, techniques to optimize your storage, and various methods to organize parallel disk I/O. Because time is limited, I will not go into the details of every possible method, but I will give several examples of parallel I/O using MPI-IO (part of MPI-2), and then briefly discuss the strengths and limitations of the most popular higher-level parallel I/O libraries (HDF5, NetCDF, and ADIOS).
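
As a rough illustration of the kind of access pattern the MPI-IO examples revolve around, the sketch below has several writers store their data into one shared file, each at its own non-overlapping byte offset. This is not MPI-IO itself: it uses only the Python standard library, threads stand in for MPI ranks, and every name in it (`write_block`, `write_shared_file`, `N_RANKS`, the data values) is a hypothetical choice made for illustration. In real MPI-IO the per-rank write at an explicit offset would be done with a call such as `MPI_File_write_at`.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Assumptions for illustration only: 4 "ranks", 3 integers per rank.
N_RANKS = 4
ITEMS_PER_RANK = 3
ITEM = struct.Struct("<i")  # one 4-byte little-endian signed integer


def write_block(path, rank):
    """Write this rank's contiguous block at an offset derived from its rank."""
    offset = rank * ITEMS_PER_RANK * ITEM.size
    payload = b"".join(
        ITEM.pack(rank * 100 + i) for i in range(ITEMS_PER_RANK)
    )
    # Each writer opens its own handle, so no file position is shared.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)


def write_shared_file(path):
    """Preallocate the file, then let all ranks write their blocks concurrently."""
    with open(path, "wb") as f:
        f.truncate(N_RANKS * ITEMS_PER_RANK * ITEM.size)
    with ThreadPoolExecutor(max_workers=N_RANKS) as pool:
        # list() forces completion and re-raises any exception from a writer.
        list(pool.map(lambda rank: write_block(path, rank), range(N_RANKS)))


def read_shared_file(path):
    """Read the whole file back as a flat list of integers."""
    with open(path, "rb") as f:
        data = f.read()
    return [value for (value,) in ITEM.iter_unpack(data)]
```

Because the offsets are computed from the rank alone, the writers need no coordination with one another, which is the property that makes a single shared file workable for parallel output. The higher-level libraries mentioned above build self-describing file formats on top of this kind of offset-based access.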