Ignacio Cano, University of Washington; Srinivas Aiyar, Varun Arora, Manosiz Bhattacharyya, Akhilesh Chaganti, Chern Cheah, Brent Chun, Karan Gupta, and Vinayak Khot, Nutanix Inc.; Arvind Krishnamurthy, University of Washington
Modern cluster storage systems perform a variety of background tasks to improve the performance, availability, durability, and cost-efficiency of stored data. For example, cleaners compact fragmented data to generate long sequential runs, tiering services automatically migrate data between solid-state and hard disk drives based on usage, recovery mechanisms replicate data to improve availability and durability in the face of failures, and cost-saving techniques transform data to reduce storage costs.
In this work, we present Curator, a background MapReduce-style execution framework for cluster management tasks, in the context of a distributed storage system used in enterprise clusters. We describe Curator’s design and implementation, and evaluate its performance using a handful of relevant metrics. We further report experiences and lessons learned from its five-year construction period, as well as from thousands of customer deployments. Finally, we propose a machine learning-based model to identify an efficient execution policy for Curator’s management tasks that can adapt to varying workload characteristics.
author = {Ignacio Cano and Srinivas Aiyar and Varun Arora and Manosiz Bhattacharyya and Akhilesh Chaganti and Chern Cheah and Brent Chun and Karan Gupta and Vinayak Khot and Arvind Krishnamurthy},
title = {Curator: {Self-Managing} Storage for Enterprise Clusters},
booktitle = {14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)},
year = {2017},
isbn = {978-1-931971-37-9},
address = {Boston, MA},
pages = {51--66},
url = {https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/cano},
publisher = {USENIX Association},
month = mar
}