Revisiting Network Coding for Warm Blob Storage

Authors: 

Chuang Gan, Huazhong University of Science and Technology; Yuchong Hu, Huazhong University of Science and Technology and Shenzhen Huazhong University of Science and Technology Research Institute; Leyan Zhao, Xin Zhao, Pengyu Gong, and Dan Feng, Huazhong University of Science and Technology

Abstract: 

Minimum-storage regenerating (MSR) codes are repair-optimal erasure codes that minimize the bandwidth for repairing a failed node, while minimizing the storage redundancy necessary for fault tolerance. Recent studies in the literature, from both the coding theory and systems communities, mainly examine MSR codes in systematic form, which keeps the original data blocks as part of the encoded blocks for direct access. However, systematic MSR codes manage encoded blocks at the sub-block granularity and access non-contiguous sub-blocks during repairs to achieve bandwidth optimality. Thus, their actual repair performance is impaired by non-contiguous I/Os, especially when the block size is small. In this paper, we explore how non-systematic MSR codes, which generate purely coded blocks based on random linear coding in the classical network coding theory, can improve I/O efficiency in repair for practical warm blob (binary large object) storage systems that are dominated by a large fraction of small blobs. To this end, we design NCBlob, a network-coding-based warm blob storage system that encodes small blobs with non-systematic MSR codes to achieve high repair performance, while leveraging the access locality of small blobs to maintain high normal read performance. Experiments on Alibaba Cloud show that NCBlob reduces the single-block repair time by up to 45.0%, and the full-node repair time by up to 38.4%, with as low as 2.1% read throughput loss, compared with state-of-the-art systematic MSR codes.
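
To make the "repair-optimal" claim concrete, the following is a minimal sketch that compares single-node repair traffic at the MSR point with conventional repair that reads k whole blocks. It assumes the standard formulas from the regenerating-codes literature (each node stores M/k, and a repair contacting d helpers downloads M·d / (k·(d − k + 1)) in total). The function names and the (n, k) = (14, 10) parameters are illustrative assumptions, not taken from the paper or from NCBlob.

```python
from fractions import Fraction


def msr_repair_traffic(k: int, d: int, blob_size: int = 1) -> Fraction:
    """Total data downloaded to repair one node at the MSR point.

    Assumes the textbook MSR formulas: per-node storage alpha = M / k and
    repair traffic gamma = d * alpha / (d - k + 1), with d >= k helpers.
    """
    if d < k:
        raise ValueError("an MSR repair needs at least k helper nodes")
    per_node_storage = Fraction(blob_size, k)      # alpha = M / k
    return per_node_storage * d / (d - k + 1)      # gamma


def conventional_repair_traffic(k: int, blob_size: int = 1) -> Fraction:
    """Conventional repair reads k blocks of size M / k, i.e. M in total."""
    return Fraction(blob_size, k) * k


# Illustrative (hypothetical) parameters: n = 14 nodes, k = 10, d = n - 1 = 13.
n, k = 14, 10
d = n - 1
msr = msr_repair_traffic(k, d)
conventional = conventional_repair_traffic(k)
print(f"MSR repair traffic:          {msr} of the original data")
print(f"Conventional repair traffic: {conventional} of the original data")
```

The formula captures network traffic only. It says nothing about how many non-contiguous sub-block reads each helper performs, which is the I/O cost the abstract identifies as the practical bottleneck for small blocks.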

FAST '25 Open Access Sponsored by NetApp

BibTeX
@inproceedings {305210,
author = {Chuang Gan and Yuchong Hu and Leyan Zhao and Xin Zhao and Pengyu Gong and Dan Feng},
title = {Revisiting Network Coding for Warm Blob Storage},
booktitle = {23rd USENIX Conference on File and Storage Technologies (FAST 25)},
year = {2025},
isbn = {978-1-939133-45-8},
address = {Santa Clara, CA},
pages = {139--154},
url = {https://www.usenix.org/conference/fast25/presentation/gan},
publisher = {USENIX Association},
month = feb
}
