Accelerating Restore and Garbage Collection in Deduplication-based Backup Systems via Exploiting Historical Information
Min Fu, Dan Feng, and Yu Hua, Huazhong University of Science and Technology; Xubin He, Virginia Commonwealth University; Zuoning Chen, National Engineering Research Center for Parallel Computer; Wen Xia, Fangting Huang, and Qing Liu, Huazhong University of Science and Technology
In deduplication-based backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. The fragmentation decreases restore performance, and results in invalid chunks becoming physically scattered in different containers after users delete backups. Existing solutions attempt to rewrite duplicate but fragmented chunks to improve the restore performance, and reclaim invalid chunks by identifying and merging valid but fragmented chunks into new containers. However, they cannot accurately identify fragmented chunks due to their limited rewrite buffer. Moreover, the identification of valid chunks is cumbersome and the merging operation is the most time-consuming phase in garbage collection.
Our key observation that fragmented chunks remain fragmented in subsequent backups motivates us to propose a History-Aware Rewriting algorithm (HAR). HAR exploits historical information of backup systems to more accurately identify and rewrite fragmented chunks. Since the valid chunks are aggregated in compact containers by HAR, the merging operation is no longer required. To reduce the metadata overhead of the garbage collection, we further propose a Container-Marker Algorithm (CMA) to identify valid containers instead of valid chunks. Our extensive experimental results from real-world datasets show HAR significantly improves the restore performance by 2.6X–17X at a cost of only rewriting 0.45–1.99% data. CMA reduces the metadata overhead for the garbage collection by about 90X.
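The two ideas in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not say how fragmentation is measured or how containers are tracked, so the utilization threshold, the fixed container capacity, the recipe and index layouts, and every function name below are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical parameters; the abstract does not specify either value.
CONTAINER_CAPACITY = 4        # chunks that fit in one container
UTILIZATION_THRESHOLD = 0.5   # below this, a container counts as "sparse"


def find_sparse_containers(recipe, capacity=CONTAINER_CAPACITY,
                           threshold=UTILIZATION_THRESHOLD):
    """Return containers that a finished backup uses only sparsely.

    recipe: list of (fingerprint, container_id) pairs for one backup.
    This is the "historical information" kept for the next backup.
    """
    referenced = defaultdict(set)
    for fingerprint, container_id in recipe:
        referenced[container_id].add(fingerprint)
    return {cid for cid, fps in referenced.items()
            if len(fps) / capacity < threshold}


def plan_backup(chunks, index, sparse_from_previous):
    """Decide what to do with each chunk of the next backup.

    chunks: fingerprints in stream order.
    index: dict mapping a stored fingerprint to its container id.
    Duplicates that live in a container flagged as sparse by the
    previous backup are rewritten instead of deduplicated.
    Simplification: the index is not updated while planning.
    """
    plan = []
    for fingerprint in chunks:
        container_id = index.get(fingerprint)
        if container_id is None:
            plan.append((fingerprint, "write"))      # new chunk
        elif container_id in sparse_from_previous:
            plan.append((fingerprint, "rewrite"))    # fragmented duplicate
        else:
            plan.append((fingerprint, "dedup"))      # ordinary duplicate
    return plan


def reclaimable_containers(container_users, deleted_backups):
    """Container-level validity check (assumed bookkeeping).

    container_users: dict mapping container id to the set of backup ids
    that reference it. A container is reclaimable once every backup
    that referenced it has been deleted; no per-chunk state is needed.
    """
    deleted = set(deleted_backups)
    return {cid for cid, users in container_users.items()
            if users <= deleted}
```

In this sketch the only state carried between backups is a set of container ids, and garbage collection consults one small record per container rather than one per chunk, which is the kind of metadata saving the abstract attributes to working at container granularity.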
author = {Min Fu and Dan Feng and Yu Hua and Xubin He and Zuoning Chen and Wen Xia and Fangting Huang and Qing Liu},
title = {Accelerating Restore and Garbage Collection in Deduplication-based Backup Systems via Exploiting Historical Information},
booktitle = {2014 USENIX Annual Technical Conference (USENIX ATC 14)},
year = {2014},
isbn = {978-1-931971-10-2},
address = {Philadelphia, PA},
pages = {181--192},
url = {https://www.usenix.org/conference/atc14/technical-sessions/presentation/fu_min},
publisher = {USENIX Association},
month = jun
}