Understanding Large-Scale Storage Systems

Grand Ballroom ABCD

Half Day Morning
9:00 am-12:30 pm
Description: 

This tutorial is oriented toward administrators and developers who manage and use large-scale storage systems. An important goal of the tutorial is to give the audience the foundation for effectively comparing different storage system options, as well as a better understanding of the systems they already have.

Cluster-based parallel storage technologies are used to manage millions of files and thousands of concurrent jobs, and to deliver performance that scales from tens to hundreds of GB/sec. This tutorial examines current state-of-the-art high-performance file systems and the underlying technologies employed to deliver scalable performance across a range of scientific and industrial applications.

The tutorial starts with a look at storage devices, and SSDs in particular, which are growing in importance in all storage systems. Next, we look at how a file system is put together, comparing and contrasting SAN file systems, scale-out NAS, object-based parallel file systems, and cloud-based storage systems.

Topics include: 
  • SSD technology
  • Scaling the data path
  • Scaling metadata
  • Fault tolerance
  • Manageability
  • Cloud storage

Specific systems are discussed, including Ceph, Lustre, GPFS, PanFS, HDFS (Hadoop Distributed File System), Bigtable, LevelDB, and Google's Colossus File System.

Presentation Type: 
Training