Don't Maintain Twice, It's Alright: Merged Metadata Management in Deduplication File System with GogetaFS

Authors: 

Yanqi Pan and Wen Xia, Harbin Institute of Technology, Shenzhen; Erci Xu, Alibaba Group; Hao Huang, Xiangyu Zou, and Shiyi Li, Harbin Institute of Technology, Shenzhen

Distinguished Artifact Award Winner

Abstract: 

Emerging storage technologies, such as persistent memory and ultra-low latency SSDs, enable deduplication file systems (DedupFSes) to use non-cryptographic hashes for fast fingerprinting. However, we find that the accelerated computation exposes another major performance penalty: the seemingly innocuous in-storage deduplication metadata maintenance incurs up to 38% overhead in the I/O path.

We find the root cause is that DedupFSes maintain dedup-specific fingerprint-to-physical mappings, which incur additional crash-consistency overhead. This overhead is unnecessary. Our insight is that the deduplication mapping can be merged with the file system's logical-to-physical mapping, forming a logical-fingerprint-physical (LFP) mapping. Thus, we can persist deduplication metadata alongside file system metadata in a single I/O. We propose GogetaFS to realize the efficiency of LFP. Using a series of techniques to manage data and metadata atop LFP, GogetaFS achieves compatible, effective, and memory-efficient deduplication within the file system. Experiments across a range of workloads show that GogetaFS consistently outperforms existing DedupFSes and can minimize metadata maintenance overheads.
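As a rough illustration of the merged-mapping idea, the following is a minimal in-memory sketch, not the GogetaFS implementation. All names (`LFPEntry`, `ToyLFPStore`, `rebuild_index`) are hypothetical, CRC32 is an arbitrary stand-in for a non-cryptographic fingerprint, and a Python list stands in for persistent metadata. The byte comparison on a fingerprint hit and the scan-based index rebuild are assumptions of this sketch; the abstract does not describe them.

```python
import zlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LFPEntry:
    """One merged record: logical block -> (fingerprint, physical block)."""
    logical: int
    fingerprint: int
    physical: int


@dataclass
class ToyLFPStore:
    """Toy model; `log` stands in for persisted file system metadata."""
    log: list = field(default_factory=list)        # persisted LFP entries
    blocks: dict = field(default_factory=dict)     # physical block -> data
    fp_index: dict = field(default_factory=dict)   # volatile: fingerprint -> physical
    next_physical: int = 0

    def write(self, logical: int, data: bytes) -> LFPEntry:
        fp = zlib.crc32(data)  # illustrative non-cryptographic fingerprint
        physical = self.fp_index.get(fp)
        # Weak hashes can collide, so confirm a hit with a byte comparison.
        if physical is None or self.blocks[physical] != data:
            physical = self.next_physical
            self.next_physical += 1
            self.blocks[physical] = data
            self.fp_index[fp] = physical
        entry = LFPEntry(logical, fp, physical)
        # A single metadata append records both the file mapping and the
        # dedup mapping; no separate fingerprint table is persisted.
        self.log.append(entry)
        return entry

    def rebuild_index(self) -> dict:
        """Rebuild the volatile fingerprint index by scanning merged entries."""
        return {e.fingerprint: e.physical for e in self.log}


store = ToyLFPStore()
first = store.write(0, b"x" * 4096)
second = store.write(1, b"x" * 4096)   # duplicate data, shares a physical block
third = store.write(2, b"y" * 4096)    # new data, gets its own physical block
```

In this toy model each write appends exactly one metadata record, whereas a design with a separate fingerprint-to-physical table would need a second persistent update per write, along with its own crash-consistency handling.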

FAST '25 Open Access Sponsored by NetApp


BibTeX
@inproceedings {305252,
author = {Yanqi Pan and Wen Xia and Erci Xu and Hao Huang and Xiangyu Zou and Shiyi Li},
title = {Don{\textquoteright}t Maintain Twice, It{\textquoteright}s Alright: Merged Metadata Management in Deduplication File System with {GogetaFS}},
booktitle = {23rd USENIX Conference on File and Storage Technologies (FAST 25)},
year = {2025},
isbn = {978-1-939133-45-8},
address = {Santa Clara, CA},
pages = {479--495},
url = {https://www.usenix.org/conference/fast25/presentation/pan},
publisher = {USENIX Association},
month = feb
}
