Beehive: Erasure Codes for Fixing Multiple Failures in Distributed Storage Systems
Jun Li and Baochun Li, University of Toronto
Distributed storage systems have been increasingly deploying erasure codes (such as Reed-Solomon codes) for fault tolerance. Though Reed-Solomon codes require much less storage space than replication, reconstructing unavailable data imposes a significant amount of network transfer and disk I/O. Traditionally, it is expected that unavailable data are fixed separately. However, since failures in the data center have been observed to be correlated, fixing unavailable data of multiple failures is both unavoidable and common. In this paper, we show that reconstructing data of multiple failures in batches can cost significantly less network transfer and disk I/O than fixing them separately. We present Beehive, a new design of erasure codes that can fix unavailable data of multiple failures in batches while consuming the optimal network transfer with nearly optimal storage overhead. Evaluation results show that Beehive codes can save network transfer by up to 69.4% and disk I/O by 75% during reconstruction.
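To make the intuition behind batched reconstruction concrete, the sketch below counts block transfers for a plain Reed-Solomon code with k data blocks. It is not the Beehive construction and does not reproduce the paper's numbers. It assumes a simple baseline in which one node downloads k surviving blocks, decodes all t lost blocks, and forwards t - 1 of them to the other replacement nodes. The function names and the parameters `k` and `t` are illustrative and do not come from the paper.

```python
def separate_repair_transfer(k: int, t: int) -> int:
    """Blocks sent over the network when each of t lost blocks is
    rebuilt independently: every rebuild downloads k surviving blocks."""
    if k < 1 or t < 1:
        raise ValueError("k and t must be positive")
    return t * k


def batched_repair_transfer(k: int, t: int) -> int:
    """Blocks sent when one node downloads k surviving blocks, decodes
    all t lost blocks locally, and forwards t - 1 of them to the other
    replacement nodes (an assumed baseline, not Beehive itself)."""
    if k < 1 or t < 1:
        raise ValueError("k and t must be positive")
    return k + (t - 1)


def transfer_saving(k: int, t: int) -> float:
    """Fraction of network transfer saved by the batched baseline."""
    separate = separate_repair_transfer(k, t)
    batched = batched_repair_transfer(k, t)
    return 1.0 - batched / separate


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    for failures in (1, 2, 3):
        print(failures, transfer_saving(k=6, t=failures))
```

With a single failure the two strategies transfer the same number of blocks, so any saving in this model comes entirely from sharing the k downloaded blocks across several failures.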
author = {Jun Li and Baochun Li},
title = {Beehive: Erasure Codes for Fixing Multiple Failures in Distributed Storage Systems},
booktitle = {7th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 15)},
year = {2015},
address = {Santa Clara, CA},
url = {https://www.usenix.org/conference/hotstorage15/workshop-program/presentation/li},
publisher = {USENIX Association},
month = jul
}