Grand Ballroom ABCD
This tutorial is oriented toward administrators and developers who manage and use large-scale storage systems. An important goal of the tutorial is to give the audience the foundation for effectively comparing different storage system options, as well as a better understanding of the systems they already have.
Cluster-based parallel storage technologies are used to manage millions of files and thousands of concurrent jobs, and to deliver performance that scales from tens to hundreds of GB/sec. This tutorial will examine current state-of-the-art high-performance file systems and the underlying technologies employed to deliver scalable performance across a range of scientific and industrial applications.
The tutorial starts with a look at storage devices, and SSDs in particular, which are growing in importance in all storage systems. Next we look at how a file system is put together, comparing and contrasting SAN file systems, scale-out NAS, object-based parallel file systems, and cloud-based storage systems.
Topics covered:
- SSD technology
- Scaling the data path
- Scaling metadata
- Fault tolerance
- Manageability
- Cloud storage
Specific systems are discussed, including Ceph, Lustre, GPFS, PanFS, HDFS (Hadoop Distributed File System), BigTable, LevelDB, and Google's Colossus File System.