8:00 am–9:00 am, Monday
Continental Breakfast
Mezzanine East/West
9:00 am–9:15 am, Monday
Program Co-Chairs: Ken Salem, University of Waterloo, and John Strunk, NetApp
9:15 am–10:45 am, Monday
Session Chair: Daniel Ellard, Raytheon BBN Technologies
Erez Zadok, Aashray Arora, Zhen Cao, Akhilesh Chaganti, Arvind Chaudhary, and Sonam Mandal, Stony Brook University
Most storage systems come with a large set of parameters that directly or indirectly control a specific set of metrics, which may include performance, energy, etc. Often, storage systems are deployed with default configurations, rendering them sub-optimal. Finding optimal configurations is difficult due to the numerous combinations of parameters and parameter sensitivity to workloads and deployed environments. Previous research on parameter optimization was either limited to narrow problems or not widely applicable to storage stack parameter optimization in general. Based on promising early results, we propose using meta-heuristic techniques such as genetic algorithms to efficiently find near-optimal configurations for storage systems.
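The genetic-algorithm approach this abstract proposes can be illustrated with a minimal sketch; the parameter space, the benchmark() scoring function, and all tuning constants below are assumptions for illustration, not the authors' implementation.

```python
import random

# Hypothetical storage-stack parameter space; names and values are illustrative.
PARAM_SPACE = {
    "io_scheduler": ["noop", "deadline", "cfq"],
    "block_size_kb": [4, 16, 64, 128],
    "journal_mode": ["ordered", "writeback"],
    "readahead_kb": [128, 256, 512, 1024],
}

def random_config():
    return {k: random.choice(v) for k, v in PARAM_SPACE.items()}

def crossover(a, b):
    # Each parameter is inherited from one of the two parent configurations.
    return {k: random.choice([a[k], b[k]]) for k in PARAM_SPACE}

def mutate(cfg, rate=0.1):
    return {k: (random.choice(v) if random.random() < rate else cfg[k])
            for k, v in PARAM_SPACE.items()}

def evolve(benchmark, generations=20, pop_size=16, elite=4):
    """benchmark(cfg) -> score (e.g., measured throughput); higher is better."""
    population = [random_config() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=benchmark, reverse=True)
        parents = ranked[:elite]              # keep the fittest configurations
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - elite)]
        population = parents + children
    return max(population, key=benchmark)
```

In practice each fitness evaluation would run a real workload against the configured storage stack, which is why reducing the number of evaluations is the central concern.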
Sharath Chandrashekhara, Kyle Marcus, Rakesh G. M. Subramanya, Hrishikesh S. Karve, Karthik Dantu, and Steven Y. Ko, SUNY-Buffalo
Today’s mobile apps often leverage cloud services to manage their own data as well as user data, enabling many desired features such as backup and sharing. However, this comes at a cost; developers have to manually craft their logic and potentially repeat a similar process for different cloud providers. In addition, users are restricted to the design choices made by developers; for example, once a developer releases an app that uses a particular cloud service, it is impossible for a user to later customize the app and choose a different service.
In this paper, we explore the design space of an app instrumentation tool that automatically integrates cloud storage services for Android apps. Our goal is to allow developers to treat all storage operations as local operations, and to automatically enable cloud features customized to the individual needs of users and developers. We discuss various scenarios that can benefit from such an automated tool, challenges associated with its development, and our ideas for addressing these challenges.
Gala Yadgar, Roman Shor, Eitan Yaakobi, and Assaf Schuster, Technion—Israel Institute of Technology
Modern flash devices, which perform updates ‘out of place’, require different optimization strategies than hard disks. The focus for flash devices is on optimizing data movement, rather than optimizing data placement. An understanding of the processes that cause data movement within a flash drive is crucial for analyzing and managing it.
While sequentiality on hard drives is easy to visualize, as is done by various defragmentation tools, data movement on flash is inherently dynamic. With the lack of suitable visualization tools, researchers and developers must rely on aggregated statistics and histograms from which the actual movement is derived. The complexity of this task increases with the complexity of state-of-the-art FTL production and research optimizations.
Adding visualization to existing research and analysis tools will greatly improve our understanding of modern, complex flash-based systems. We developed SSDPlayer, a graphical tool for visualizing the various processes that cause data movement on SSDs. We use SSDPlayer to demonstrate how visualization can help us shed light on the complex phenomena that cause data movement and expose new opportunities for optimization.
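As a rough illustration of the kind of events such a visualizer animates, the toy model below tracks per-page state changes caused by out-of-place writes and garbage collection. It is an assumed simplification, not SSDPlayer's implementation.

```python
from collections import defaultdict

class FTLTrace:
    """Toy model of the events a flash visualizer could animate:
    out-of-place writes invalidate old pages; GC relocates valid pages."""

    def __init__(self):
        self.mapping = {}                           # logical page -> physical page
        self.state = defaultdict(lambda: "clean")   # physical page -> state
        self.events = []                            # stream consumed by a visualizer

    def write(self, lpn, ppn):
        old = self.mapping.get(lpn)
        if old is not None:
            self.state[old] = "invalid"             # out-of-place update
            self.events.append(("invalidate", old))
        self.mapping[lpn] = ppn
        self.state[ppn] = "valid"
        self.events.append(("program", lpn, ppn))

    def gc_move(self, lpn, src_ppn, dst_ppn):
        # Garbage collection moves still-valid data: pure data movement,
        # invisible in aggregate statistics but central to write amplification.
        self.state[src_ppn] = "invalid"
        self.state[dst_ppn] = "valid"
        self.mapping[lpn] = dst_ppn
        self.events.append(("gc_move", lpn, src_ppn, dst_ppn))
```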
Xianzheng Dou, Jason Flinn, and Peter M. Chen, University of Michigan
We propose a new point in the design space of versioning and provenance-aware file systems in which the entire operating system, not just the file system, supports such functionality. We leverage deterministic record-and-replay to substitute computation for data. This leads to a new file system design where the log of nondeterministic inputs, not file data, is the fundamental unit of persistent storage. We outline a distributed storage system design based on these principles and describe the challenges we foresee for achieving our vision.
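A minimal sketch of the substitute-computation-for-data idea under simplifying assumptions: persist only a log of nondeterministic inputs and regenerate file contents by deterministically re-running the program. The classes and helpers below are hypothetical, not the authors' design.

```python
import json

class ReplayLog:
    """Illustrative log of nondeterministic inputs; file data is regenerated
    by re-running the (deterministic) computation rather than stored directly."""

    def __init__(self):
        self.entries = []              # e.g., {"type": "read", "data": "..."}

    def record(self, entry):
        self.entries.append(entry)

    def serialize(self):
        return json.dumps(self.entries)

def replay(log_json, program):
    """Re-run `program`, feeding it the recorded nondeterministic inputs.
    Because `program` is deterministic given those inputs, its file writes
    are reproduced exactly instead of being persisted."""
    inputs = iter(json.loads(log_json))
    outputs = {}
    def read_input():
        return next(inputs)["data"]
    def write_file(path, data):
        outputs[path] = data           # regenerated file contents
    program(read_input, write_file)
    return outputs
```

For example, a program that reads user input and writes a derived file can be re-executed from the log to regenerate that file on demand, so only the (typically much smaller) log needs to be stored and replicated.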
10:45 am–11:15 am, Monday
Break with Refreshments
Mezzanine East/West
11:15 am–12:30 pm, Monday
12:30 pm–2:15 pm, Monday
Luncheon for Workshop Attendees
Terra Courtyard
2:15 pm–3:30 pm, Monday
Session Chair: Nisha Talagala, SanDisk
Xuebin Zhang, Jiangpeng Li, Kai Zhao, Hao Wang, and Tong Zhang, Rensselaer Polytechnic Institute
This paper presents a method to implement delta compression for metadata storage in flash memory. Given the abundant temporal redundancy in metadata, it is intuitive to expect that flash-based metadata storage can significantly benefit from delta compression. However, a straightforward realization of delta compression demands storing the original data and the deltas among different versions in different flash memory physical pages, which leads to significant overhead in terms of read/write latency and data management complexity. Through experiments with 20nm NAND flash memory chips, we observed that, when operating in SLC mode, a flash memory page can be programmed in a progressive manner, i.e., different portions of the same SLC flash memory page can be programmed at different times. This motivates us to propose a simple design approach that realizes delta compression for metadata storage without latency or data management complexity overheads. The key idea is to allocate SLC-mode flash memory pages for metadata, and to store the original data and all subsequent deltas in the same physical page through progressive programming. Experimental results show that this approach can significantly reduce metadata write traffic without any latency overhead.
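A minimal sketch of the key idea, assuming a page-sized region that can be programmed progressively: keep the original metadata and append each compressed delta into the remaining space of the same SLC-mode page. The page size and helper functions are illustrative, not the paper's design.

```python
PAGE_SIZE = 16 * 1024   # assumed SLC-mode page size; illustrative only

class ProgressivePage:
    """Models one SLC-mode page programmed progressively: the original
    metadata block first, then each delta appended into the still-unwritten
    remainder of the same physical page."""

    def __init__(self, original: bytes):
        assert len(original) <= PAGE_SIZE
        self.segments = [original]          # segment 0 = original, rest = deltas

    def used(self):
        return sum(len(s) for s in self.segments)

    def append_delta(self, delta: bytes) -> bool:
        """Return False if the page is full and a new page must be allocated."""
        if self.used() + len(delta) > PAGE_SIZE:
            return False
        self.segments.append(delta)
        return True

    def latest(self, apply_delta):
        """Reconstruct the newest version by applying deltas in order.
        `apply_delta(base, delta) -> bytes` is a placeholder decoder."""
        data = self.segments[0]
        for d in self.segments[1:]:
            data = apply_delta(data, d)
        return data
```

Because the original and its deltas share one physical page, reading the latest version still costs a single page read, which is why the approach avoids the latency overhead of scattering deltas across pages.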
Jun Li and Baochun Li, University of Toronto
Distributed storage systems have been increasingly deploying erasure codes (such as Reed-Solomon codes) for fault tolerance. Though Reed-Solomon codes require much less storage space than replication, a significant amount of network transfer and disk I/O is imposed when fixing unavailable data by reconstruction. Traditionally, unavailable data are expected to be fixed separately. However, since failures in the data center are observed to be correlated, fixing the unavailable data of multiple failures is unavoidable and even common. In this paper, we show that reconstructing the data of multiple failures in batches can cost significantly less network transfer and disk I/O than fixing them separately. We present Beehive, a new design of erasure codes that can fix the unavailable data of multiple failures in batches while consuming optimal network transfer with nearly optimal storage overhead. Evaluation results show that Beehive codes can save network transfer by up to 69.4% and disk I/O by 75% during reconstruction.
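As a rough illustration of why batching helps, the sketch below uses the conventional Reed-Solomon repair accounting (each lost block is rebuilt from k surviving blocks); it is not the Beehive construction, which uses a more elaborate code design to reach optimal network transfer. The parameters are invented for illustration.

```python
def separate_repair_transfer(k, t, block_mb):
    """Naive repair: each of the t lost blocks triggers its own read of
    k surviving blocks (classic Reed-Solomon reconstruction)."""
    return t * k * block_mb

def batched_repair_transfer(k, t, block_mb):
    """Batched repair at a single decoder: read k surviving blocks once
    and decode all t lost blocks from them."""
    return k * block_mb

if __name__ == "__main__":
    k, t, block_mb = 6, 3, 64        # illustrative RS(6,3) layout, 64 MB blocks
    print(separate_repair_transfer(k, t, block_mb))   # 1152 MB
    print(batched_repair_transfer(k, t, block_mb))    # 384 MB
```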
Wen Xia and Chunguang Li, Huazhong University of Science and Technology; Hong Jiang, University of Nebraska–Lincoln; Dan Feng, Yu Hua, Leihua Qin, and Yucheng Zhang, Huazhong University of Science and Technology
Delta compression, a promising data reduction approach capable of finding the small differences (i.e., the delta) among very similar files and chunks, is widely used for optimizing replica synchronization, backup/archival storage, cache compression, etc. However, delta compression is costly because of its time-consuming word-matching operations for delta calculation. Our in-depth examination suggests that there exists strong word-content locality for delta compression, meaning that contiguous duplicate words appear in approximately the same order in their similar versions. This observation motivates us to propose Edelta, a fast delta compression approach based on a word-enlarging process that exploits word-content locality. Specifically, Edelta first tentatively finds a matched (duplicate) word, and then greedily stretches the matched word's boundary to find a likely much longer (enlarged) duplicate word. Hence, Edelta effectively reduces a potentially large number of the traditional time-consuming word-matching operations to a single word-enlarging operation, which significantly accelerates the delta compression process. Our evaluation based on two case studies shows that Edelta achieves an encoding speedup of 3X–10X over the state-of-the-art Ddelta, Xdelta, and Zdelta approaches without noticeably sacrificing the compression ratio.
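A minimal sketch of the word-enlarging idea under simplifying assumptions (fixed-size words, a simple in-memory index); this is not the Edelta implementation.

```python
def word_enlarge_delta(base: bytes, new: bytes, word: int = 8):
    """Index fixed-size words of `base`; on each word hit in `new`, greedily
    stretch the match boundary instead of matching word by word."""
    index = {base[i:i + word]: i for i in range(0, len(base) - word + 1)}
    ops, pos, lit = [], 0, bytearray()

    def flush():
        if lit:
            ops.append(("literal", bytes(lit)))
            lit.clear()

    while pos + word <= len(new):
        b = index.get(new[pos:pos + word])
        if b is None:
            lit.append(new[pos]); pos += 1
            continue
        length = word
        # Enlarging step: extend the match while the following bytes agree,
        # replacing many word-matching operations with one comparison loop.
        while (b + length < len(base) and pos + length < len(new)
               and base[b + length] == new[pos + length]):
            length += 1
        flush()
        ops.append(("copy", b, length))
        pos += length
    lit.extend(new[pos:])
    flush()
    return ops   # decodable: replay copies from `base` and literals in order
```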
3:30 pm–4:00 pm, Monday
Break with Refreshments
Mezzanine East/West
4:00 pm–5:15 pm, Monday
Session Chair: Raju Rangaswami, Florida International University
Michaela Blott, Ling Liu, and Kimon Karras, Xilinx Research, Dublin; Kees Vissers, Xilinx Research, San Jose
Current web infrastructure relies increasingly on distributed in-memory key-value stores such as memcached, yet typical x86-based implementations of TCP/IP-compliant memcached offer limited performance scalability. FPGA-based data-flow architectures exceed every other published, fully compliant implementation in throughput, scaling to 80 Gbps while offering much higher power efficiency and lower latency. However, value store capacity remains limited by the DRAM available in today's devices.
In this paper, we present and quantify novel hybrid memory systems that combine conventional DRAM and serial-attached flash to increase value store capacity to 40 Terabytes with up to 200 million entries while providing access at 80 Gbps. This is achieved by distributing objects over DRAM and flash based on their size, and by data-flow architectures with customized memory controllers that compensate for large variations in access latencies and bandwidths. We present measured experimental proof points, mathematically validate these concepts for published value size distributions from Facebook, Wikipedia, Twitter, and Flickr, and compare them to existing solutions.
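A minimal sketch of size-based object placement across DRAM and flash, with an assumed size threshold; the class and threshold are illustrative stand-ins for the paper's FPGA memory controllers, not their design.

```python
class HybridStore:
    """Illustrative size-based placement for a key-value store: small values
    stay in DRAM; large values go to (serial-attached) flash."""

    def __init__(self, dram_threshold=1024):   # assumed size cutoff in bytes
        self.dram_threshold = dram_threshold
        self.dram = {}      # stand-in for the DRAM value store
        self.flash = {}     # stand-in for the flash value store

    def set(self, key, value: bytes):
        tier = self.dram if len(value) <= self.dram_threshold else self.flash
        other = self.flash if tier is self.dram else self.dram
        other.pop(key, None)              # a key lives in exactly one tier
        tier[key] = value

    def get(self, key):
        if key in self.dram:              # DRAM hit: lowest-latency path
            return self.dram[key]
        return self.flash.get(key)        # flash path: higher, more variable latency
```

Because most values in published distributions are small while most bytes belong to large values, this kind of split keeps the hot, latency-critical traffic in DRAM while flash supplies the bulk of the capacity.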
John Bent, EMC Corporation; Brad Settlemyer and Nathan DeBardeleben, Los Alamos National Laboratory; Sorin Faibish, Uday Gupta, Dennis Ting, and Percy Tzelnic, EMC Corporation
For many emerging and existing architectures, NAND flash is the storage medium used to fill the cost-performance gap between DRAM and spinning disk. However, while NAND flash is the best of the available options, its specific design choices and trade-offs are not wholly suitable for many workloads. One such workload is that of long-running scientific applications that use checkpoint-restart for failure recovery. For these workloads, HPC data centers are deploying NAND flash as a storage acceleration tier, commonly called burst buffers, to provide high levels of write bandwidth for checkpoint storage. In this paper, we compare the costs of adding reliability to such a layer against the benefits of not doing so. We find that, even though NAND flash is non-volatile, HPC burst buffers should not be made reliable when the performance overhead of adding reliability exceeds 2%.
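A back-of-the-envelope sketch of the kind of cost/benefit comparison the abstract describes: weigh the expected time lost to occasional burst-buffer data loss against the time added by slowing the job down for reliability. The MTBF, rework-time, and overhead numbers below are invented for illustration and are not the paper's model or data.

```python
def extra_runtime_without_reliability(job_hours, bb_mtbf_hours, rework_hours):
    """Expected hours lost over one job to burst-buffer data loss:
    (expected number of buffer failures) x (hours of lost work redone)."""
    return (job_hours / bb_mtbf_hours) * rework_hours

def extra_runtime_with_reliability(job_hours, overhead):
    """Hours added by a reliability mechanism (e.g., replication or erasure
    coding) that slows the job by `overhead`, e.g., 0.02 for a 2% penalty."""
    return job_hours * overhead

if __name__ == "__main__":
    job, mtbf, rework = 500.0, 50_000.0, 1.0     # illustrative numbers only
    print(extra_runtime_without_reliability(job, mtbf, rework))   # 0.01 hours
    print(extra_runtime_with_reliability(job, 0.02))              # 10.0 hours
```

With rare failures and cheap rework, even a small reliability overhead paid on every checkpoint can cost far more time than simply accepting the occasional loss and re-checkpointing.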
Hugh Greenberg, Los Alamos National Laboratory; John Bent, EMC Corporation; Gary Grider, Los Alamos National Laboratory
The long-expected convergence of High Performance Computing and Big Data Analytics is upon us. Unfortunately, the computing environments created for each workload are not necessarily conducive to the other. In this paper, we evaluate the ability of traditional high performance computing architectures to run big data analytics. We discover and describe limitations that prevent the seamless use of existing big data analytics tools and software. Specifically, we evaluate the effectiveness of distributed key-value stores for manipulating large data sets across tightly coupled parallel supercomputers. Although existing distributed key-value stores have proven highly effective in cloud environments, we find their performance on HPC clusters to be degraded. Accordingly, we have built an HPC-specific key-value store called the Multi-Dimensional Hierarchical Indexing Middleware (MDHIM). Using standard big data benchmarks, we find that MDHIM's performance more than triples that of Cassandra on HPC systems.
6:00 pm–7:00 pm, Monday
Joint Poster Session and Happy Hour with HotCloud
Mezzanine East/West