Power, Energy, and Thermal Considerations in SSD-Based I/O Acceleration

Jie Zhang; Mustafa Shihab; Myoungsoo Jung; Jin-Soo Kim; Seungryoul Maeng; Srinivasan Narayanamurthy; Ranjit Kumar; Siddhartha Nandi

Workshop Program

All sessions will be held in Grand Ballroom D unless otherwise noted.

The full papers published by USENIX for this workshop are available as a download or individually below. Copyright to the individual works is retained by the author(s).

Tuesday, June 17, 2014

8:00 a.m.–9:00 a.m.	Tuesday
Continental Breakfast Columbus Foyer
9:00 a.m.–10:00 a.m.	Tuesday
Joint Keynote with HotCloud
Columbus Ballroom
Cloudy with a Chance of …
Michael Franklin, Thomas M. Siebel Professor of Computer Science, University of California, Berkeley
10:00 a.m.–10:30 a.m.	Tuesday
Break with Refreshments Columbus Foyer
10:30 a.m.–12:10 p.m.	Tuesday
Joint Session with HotCloud
Session Chair: Curt Kolovson, VMware
Columbus Ballroom

Convergent Dispersal: Toward Storage-Efficient Security in a Cloud-of-Clouds
Mingqiang Li, Chuan Qin, Patrick P. C. Lee, The Chinese University of Hong Kong; Jin Li, Guangzhou University
Cloud-of-clouds storage exploits the diversity of cloud storage vendors to provide fault tolerance and avoid vendor lock-ins. Its inherent diversity property also enables us to offer keyless data security via dispersal algorithms. However, the keyless security of existing dispersal algorithms relies on the embedded random information, which breaks data deduplication of the dispersed data. To simultaneously enable keyless security and deduplication, we propose a novel dispersal approach called convergent dispersal, which replaces original random information with deterministic cryptographic hash information that is derived from the original data but cannot be inferred by attackers without knowing the whole data. We develop two convergent dispersal algorithms, namely CRSSS and CAONT-RS. Our evaluation shows that CRSSS and CAONT-RS provide complementary performance advantages for different parameter settings.

On the Feasibility of Data Loss Insurance for Personal Cloud Storage
Xiaosong Ma, Qatar Computing Research Institute
Personal data are important assets that people nowadays entrust to cloud storage services for the convenience of easy, ubiquitous access. To attract/retain customers, cloud storage companies aggressively replicate and georeplicate data. Such replication may be over-cautious for the majority of data objects and contributes to the relatively high price of cloud storage. Yet cloud storage companies are reluctant to provide customers with any guarantee against permanent data loss. In this paper, we discuss the viability of cloud storage services providing optional data insurance. We examine major risks associated with cloud storage data loss and derive a crude model for premium calculation. The estimated premium level (per unit declared value) in most scenarios is found to be significantly smaller than that accepted in mature businesses like shipping. Therefore, optional insurance can potentially provide cloud storage services with more flexibility and cost-effectiveness in resource management, and customers with both peace of mind and lowered cost.

Harmonium: Elastic Cloud Storage via File Motifs
Helgi Sigurbjarnarson, Petur Orri Ragnarsson, Ymir Vigfusson, Reykjavik University; Mahesh Balakrishnan, Microsoft Research
Modern applications expand to fill the space available to them, exploiting local storage to improve performance by caching, prefetching and precomputing data. In virtualized settings, this behavior compromises storage elasticity owing to a rigid contract between the hypervisor and the guest OS: once space is allocated to a virtual disk and used by an application, it cannot be reclaimed by the hypervisor. In this paper, we propose a new guest filesystem called Harmonium that exploits the ephemeral or derivative nature of application data. Each file in Harmonium optionally has a motif that describes how the file can be reconstructed via computation, network accesses, or operations on other files. Harmonium expands files from their motifs when space is available, and contracts them back to their motifs when it is scarce. Given a target size, the system selects files to expand or contract based on the load on the CPU, network, and storage, as well as expected access patterns. As a result, Harmonium enables elastic cloud storage, allowing the hypervisor to dynamically balance storage across multiple VMs.

Impression Store: Compressive Sensing-based Storage for Big Data Analytics
Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda, and Zheng Zhang, Microsoft Research
For many big data analytics workloads, approximate results suffice. This begs the question of whether and how the underlying system architecture can take advantage of such relaxations, thereby lifting constraints inherent in today’s architectures. This position paper explores one of the possible directions. Impression Store is a distributed storage system with the abstraction of big data vectors. It aggregates updates internally and responds to the retrieval of top-K high-value entries. With proper extension, Impression Store supports various aggregations, top-K queries, outlier and major mode detection. While restricted in scope, such queries represent a substantial and important portion of many production workloads. In return, the system has unparalleled scalability; any node in the system can process any query, both reads and updates. The key technique we leverage is compressive sensing, a technique that substantially reduces the amount of active memory state, IO, and traffic volume needed to achieve such scalability.
12:10 p.m.–2:00 p.m.	Tuesday
FCW '14 Luncheon Grand Ballroom ABC
2:00 p.m.–3:30 p.m.	Tuesday
Money, Batteries, and Shingles
Session Chair: Nitin Agrawal, NEC

qNVRAM: quasi Non-Volatile RAM for Low Overhead Persistency Enforcement in Smartphones
Hao Luo, Lei Tian and Hong Jiang, University of Nebraska, Lincoln
The persistent storage options in smartphones employ journaling or double-write to enforce atomicity, consistency and durability, which introduces significant overhead to system performance. Our in-depth examination of the issue leads us to believe that much of the overhead would be unnecessary if we rethink the volatility of memory considering the battery-backed characteristics of DRAM in modern-day smartphones. With this rethinking, we propose quasi Non-Volatile Memory (qNVRAM), a new design that makes the DRAM in smartphones quasi non-volatile, to help remove the performance overhead of enforcing persistency. We assess the feasibility and effectiveness of our design by implementing a persistent page cache in SQLite. Our evaluation on a real Android smartphone shows that qNVRAM speeds up the insert, update and delete transactions by up to 16.33x, 15.86x and 15.76x respectively.

Novel Address Mappings for Shingled Write Disks
Weiping He and David H.C. Du, University of Minnesota, Twin Cities
Shingled Write Disks (SWDs) increase the storage density by writing data in overlapping tracks. Consequently, data cannot be updated freely in place without overwriting the valid data, if any, in subsequent tracks. A write operation therefore may incur several extra read and write operations, which creates a write amplification problem. In this paper, we propose several novel static Logical Block Address (LBA) to Physical Block Address (PBA) mapping schemes for in-place update SWDs which significantly reduce the write amplification. The experiments with four traces demonstrate that our scheme can provide comparable performance to that of regular Hard Disk Drives (HDDs) when the SWD space usage is no more than 75%.

On the Importance of Evaluating Storage Systems’ $Costs
Zhichao Li, Amanpreet Mukker, and Erez Zadok, Stony Brook University
Modern storage systems are becoming more complex, combining different storage technologies with different behaviors. Performance alone is not enough to characterize storage systems: energy efficiency, durability, and more are becoming equally important. We posit that one must evaluate storage systems from a monetary cost perspective as well as performance. We believe that cost should consider the workloads used over the storage systems’ expected lifetime. We designed and developed a versatile hybrid storage system under Linux that combines HDD and SSD. The SSD can be used as cache or as primary storage for hot data. Our system includes tunable parameters to enable trading off performance, energy use, and durability. We built a cost model and evaluated our system under a variety of workloads and parameters, to illustrate the importance of cost evaluations of storage systems.
3:30 p.m.–4:00 p.m.	Tuesday
Break with Refreshments Columbus Foyer
4:00 p.m.–5:15 p.m.	Tuesday
A Brave New World (of Storage System Design)
Session Chair: Margo Seltzer, Harvard School of Engineering and Applied Sciences and Oracle

Towards High-Performance Application-Level Storage Management
Simon Peter, Jialin Li, Doug Woos, Irene Zhang, Dan R. K. Ports, Thomas Anderson, Arvind Krishnamurthy, and Mark Zbikowski, University of Washington
We propose a radical re-architecture of the traditional operating system storage stack to move the kernel off the data path. Leveraging virtualized I/O hardware for disk and flash storage, most read and write I/O operations go directly to application code. The kernel dynamically allocates extents, manages the virtual to physical binding, and performs name translation. The benefit is to dramatically reduce the CPU overhead of storage operations while improving application flexibility.

NVMKV: A Scalable and Lightweight Flash Aware Key-Value Store
Leonardo Mármol, Florida International University; Swaminathan Sundararaman and Nisha Talagala, FusionIO; Raju Rangaswami, Florida International University; Sushma Devendrappa, Bharath Ramsundar, and Sriram Ganesan, FusionIO
State-of-the-art flash-optimized KV stores frequently rely upon a log structure and/or compaction-based strategy to optimally organize content on flash. However, these strategies lead to excessive I/O, beyond the write amplification generated within the flash itself, with both the application and the flash device constantly rearranging data. In this paper, we explore the other extreme in the design space: minimal data management at the KV store and heavy reliance on the Flash Translation Layer (FTL) capabilities. NVMKV is a scalable and lightweight KV store that leverages advanced capabilities that are becoming available in modern FTLs. We demonstrate that NVMKV (i) performs KV operations at close to native device access speeds for get operations, (ii) outperforms state-of-the-art KV stores by 50%-300%, (iii) significantly improves performance predictability for the YCSB KV benchmark when compared with the popular LevelDB KV store, and (iv) reduces data written to flash by as much as 1.7X and 29X for sequential and random write workloads relative to LevelDB, thereby dramatically increasing device lifetime.

FlashQueryFile: Flash-Optimized Layout and Algorithms for Interactive Ad Hoc SQL on Big Data
Rini T. Kaushik, IBM Research - Almaden
A high-performance storage layer is vital for allowing interactive ad hoc SQL analytics (OLAP style) over Big Data. The paper makes a case for leveraging flash in the Big Data stack to speed up queries. State-of-the-art Big Data layouts and algorithms are optimized for hard disks (i.e., sequential access is emphasized over random access) and result in suboptimal performance on flash given its drastically different performance characteristics. While existing columnar and row-columnar layouts are able to reduce disk IO compared to row-based layouts, they still end up reading significant columnar data irrelevant to the query as they only employ coarse-grained, intra-columnar data skipping which doesn’t work across all queries. FlashQueryFile’s specialized columnar data layouts, selection, and projection algorithms fully exploit fast random accesses and high internal I/O parallelism of flash to allow fast and I/O-efficient query processing and fine-grained, intra-columnar data skipping to minimize data read per query. FlashQueryFile results in 11X-100X TPC-H query speedup and 38%-99.08% reduction in data read compared to flash-based HDD-optimized row-columnar data layout and its associated algorithms.
6:00 p.m.–7:00 p.m.	Tuesday
Tuesday Happy Hour Columbus Foyer

Wednesday, June 18, 2014

8:00 a.m.–9:00 a.m.	Wednesday
Continental Breakfast Columbus Foyer
9:00 a.m.–10:30 a.m.	Wednesday
Keynote Address
The Application/Storage Interface: After All These Years, We're Still Doing It Wrong
Remzi Arpaci-Dusseau, University of Wisconsin-Madison
Despite years of experience, countless implementations, and increased importance in the modern era, many basic facets of storage systems remain problematic. In this talk, I'll highlight two fundamental problems found at the interface between applications and storage, and suggest new directions that help bridge the gap between what applications need and what current storage systems provide.

Remzi Arpaci-Dusseau is a professor of Computer Sciences at the University of Wisconsin-Madison. He and his wife Andrea co-lead a research group that has been active in the systems community for many years; their work has had academic impact (including nine Best Paper awards) and practical impact (including the transaction checksum in Linux ext4 and fast file system checking in FreeBSD). Remzi co-chaired USENIX ATC '04, FAST '07, OSDI '10, and will co-chair SOCC '14. Remzi has won numerous teaching awards and is co-author (with his wife) of a free online operating systems book (available at http://www.ostep.org); chapters of the book have been downloaded over 1 million times in the past few years.
10:30 a.m.–11:00 a.m.	Wednesday
Break with Refreshments Columbus Foyer
11:00 a.m.–12:15 p.m.	Wednesday
Hotpourri
Session Chair: Doug Santry, NetApp

Assert(!Defined(Sequential I/O))
Cheng Li, Rutgers University; Philip Shilane, Fred Douglis, Darren Sawyer, and Hyong Shim, EMC Corporation
The term sequential I/O is widely used in systems research with the intuitive understanding that it means consecutive access. From a survey of the literature, though, this intuitive understanding has translated into numerous, inconsistent definitions. Since sequential I/O is such a fundamental concept in systems research, we believe that a sequentiality metric should allow us to compare access patterns in a meaningful way. We explore access properties that could be incorporated into potential metrics for sequential I/O including: access size, gaps between accesses, multi-stream, and inter-arrival time. We then analyze hundreds of large-scale storage traces and discuss how potential metrics compare. Interestingly, we find I/O traces considered highly sequential by one metric can be highly random to another metric. We further demonstrate that many plausible metrics are weakly correlated, though metrics weighted by size have more consistency. While there may not be a single metric for sequential I/O that is best in all cases, we believe systems researchers should more carefully consider, and state, which definition they use.

Towards Paravirtualized Network File Systems
Raja Appuswamy, Sergey Legtchenko, and Antony Rowstron, Microsoft Research, Cambridge
The virtualized storage stack used in enterprise data centers provides two mechanisms to enable virtualized applications to store and retrieve data, namely, virtual disks and network file systems. In this paper, we examine the pros and cons of using these two mechanisms to integrate emerging non-volatile memory devices, and show how neither of them provides low-overhead access to data without sacrificing compatibility with other popular virtualization-enabled features. In doing so, we present paravirtualized NFS, an alternate mechanism for accessing data, highlight its benefits, and outline research challenges involved in realizing it in practice.

Evaluation of Codes with Inherent Double Replication for Hadoop
M. Nikhil Krishnan, N. Prakash, V. Lalitha, Birenjith Sasidharan, P. Vijay Kumar, Indian Institute of Science, Bangalore; Srinivasan Narayanamurthy, Ranjit Kumar, and Siddhartha Nandi, NetApp Inc.
In this paper, we evaluate the efficacy, in a Hadoop setting, of two coding schemes, both possessing an inherent double replication of data. The two coding schemes belong to the class of regenerating and locally regenerating codes respectively, and these two classes are representative of recent advances made in designing codes for the efficient storage of data in a distributed setting. In comparison with triple replication, double replication permits a significant reduction in storage overhead, while delivering good MapReduce performance under moderate workloads. The two coding solutions under evaluation here add only moderately to the storage overhead of double replication, while simultaneously offering reliability levels similar to those of triple replication. One might expect from the property of inherent data duplication that the performance of these codes in executing a MapReduce job would be comparable to that of double replication. However, a second feature of this class of code comes into play here, namely that under both coding schemes analyzed here, multiple blocks from the same coded stripe are required to be stored on the same node. This concentration of data belonging to a single stripe negatively impacts MapReduce execution times. However, much of this effect can be undone by simply adding a larger number of processors per node. Further improvements are possible if one tailors the Map task scheduler to the codes under consideration. We present both experimental and simulation results that validate these observations.
12:15 p.m.–2:00 p.m.	Wednesday
FCW '14 Luncheon Columbus Foyer
2:00 p.m.–3:15 p.m.	Wednesday
SSDelightful
Session Chair: Geoff Kuenning, Harvey Mudd College

The Multi-streamed Solid-State Drive
Jeong-Uk Kang, Jeeseok Hyun, Hyunjoo Maeng, and Sangyeun Cho, Samsung Electronics Co.
This paper makes a case for the multi-streamed solid-state drive (SSD). It offers an intuitive storage interface for the host system to inform the SSD about the expected lifetime of data being written. We show through experimentation with a real multi-streamed SSD prototype that the worst-case update throughput of a Cassandra NoSQL DB system can be improved by nearly 56%. We discuss powerful use cases of the proposed SSD interface.

Accelerating External Sorting via On-the-fly Data Merge in Active SSDs
Young-Sik Lee, Korea Advanced Institute of Science and Technology (KAIST); Luis Cavazos Quero, Youngjae Lee, and Jin-Soo Kim, Sungkyunkwan University; Seungryoul Maeng, Korea Advanced Institute of Science and Technology (KAIST)
The concept of active SSDs (solid state drives) has been introduced in order to cope with the demands required to process the ever-increasing volumes of data. In active SSDs, some of the data-processing tasks are offloaded to SSDs, freeing host system resources and improving overall performance of data analysis. In this paper, we propose a novel active SSD architecture focused on improving the external sorting algorithm that is used extensively in data-intensive computing. By performing merge operations on-the-fly in active SSDs, our method can remove the extra data transfer and enhance the lifetime of SSDs. Our evaluation results on a real SSD platform indicate that the proposed scheme outperforms the traditional external sorting by up to 39%.

Power, Energy, and Thermal Considerations in SSD-Based I/O Acceleration
Jie Zhang, Mustafa Shihab and Myoungsoo Jung, The University of Texas at Dallas
Solid State Disks (SSDs) have risen to prominence as an I/O accelerator with low power consumption and high energy efficiency. In this paper, we question some common assumptions regarding SSDs’ operating temperature, dynamic power, and energy consumption through extensive empirical analysis. We examine three different real high-end SSDs that respectively employ multiple channels, cores, and flash chips. Our evaluations reveal that the dynamic power consumption of a many-resource SSD is, on average, 5x and 4x worse than that of an enterprise-scale SSD and HDD, respectively. This work also addresses the SSD overheating problem and power throttling issues, which result in significant performance degradation. Lastly, we offer evidence that HW/SW optimization studies are needed to improve energy efficiency in future SSDs.
3:15 p.m.–3:30 p.m.	Wednesday
Break with Refreshments Columbus Foyer
3:30 p.m.–6:00 p.m.	Wednesday
Informal Discussion Riverview B
6:30 p.m.–8:00 p.m.	Wednesday
Wednesday Reception Grand Ballroom AB
