EC-Cache: Load-Balanced, Low-Latency Cluster Caching with Online Erasure Coding

Authors: 

K. V. Rashmi, University of California, Berkeley; Mosharaf Chowdhury and Jack Kosaian, University of Michigan; Ion Stoica and Kannan Ramchandran, University of California, Berkeley

Abstract: 

Data-intensive clusters and object stores are increasingly relying on in-memory object caching to meet the I/O performance demands. These systems routinely face the challenges of popularity skew, background load imbalance, and server failures, which result in severe load imbalance across servers and degraded I/O performance. Selective replication is a commonly used technique to tackle these challenges, where the number of cached replicas of an object is proportional to its popularity. In this paper, we explore an alternative approach using erasure coding.

EC-Cache is a load-balanced, low latency cluster cache that uses online erasure coding to overcome the limitations of selective replication. EC-Cache employs erasure coding by: (i) splitting and erasure coding individual objects during writes, and (ii) late binding, wherein obtaining any k out of (k + r) splits of an object are sufficient, during reads. As compared to selective replication, EC-Cache improves load balancing by more than 3x and reduces the median and tail read latencies by more than 2x, while using the same amount of memory. EC-Cache does so using 10% additional bandwidth and a small increase in the amount of stored metadata. The benefits offered by EC-Cache are further amplified in the presence of background network load imbalance and server failures.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {199354,
author = {K. V. Rashmi and Mosharaf Chowdhury and Jack Kosaian and Ion Stoica and Kannan Ramchandran},
title = {{EC-Cache}: {Load-Balanced}, {Low-Latency} Cluster Caching with Online Erasure Coding},
booktitle = {12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)},
year = {2016},
isbn = {978-1-931971-33-1},
address = {Savannah, GA},
pages = {401--417},
url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/rashmi},
publisher = {USENIX Association},
month = nov
}

Presentation Audio