Primary Data Deduplication—Large Scale Study and System Design
Ahmed El-Shimi, Ran Kalach, Ankit Kumar, Adi Oltean, Jin Li, and Sudipta Sengupta, Microsoft Corporation
We present a large-scale study of primary data deduplication and use the findings to drive the design of a new primary data deduplication system implemented in the Windows Server 2012 operating system. File data was analyzed from 15 globally distributed file servers hosting data for over 2000 users in a large multinational corporation.
The findings are used to arrive at a chunking and compression approach that maximizes deduplication savings while minimizing the generated metadata and producing a uniform chunk size distribution. Deduplication processing scales with data size through a RAM-frugal chunk hash index and data partitioning, so that memory, CPU, and disk seek resources remain available to fulfill the primary workload of serving I/O.
We present the architecture of a new primary data deduplication system and evaluate the deduplication performance and chunking aspects of the system.
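To make the terms in the abstract concrete, the following is a minimal sketch of chunk-level deduplication: files are split at content-defined boundaries, each chunk is hashed, and an index keeps one copy per distinct hash. It is an illustration under stated assumptions, not the system described in the paper. The gear-style rolling hash, the chunk size limits, SHA-256, and the names `chunk_boundaries`, `deduplicate`, and `restore` are all hypothetical choices made for this example, and the plain in-memory dictionary does not attempt the RAM-frugal index or the data partitioning that the paper presents.

```python
import hashlib
import random

# All constants below are illustrative assumptions, not values from the paper.
MIN_CHUNK = 64     # no boundary is accepted before this many bytes
MAX_CHUNK = 1024   # a boundary is forced at this many bytes
MASK = 0xFF        # a boundary candidate occurs about once per 256 bytes

# Fixed pseudo-random table for a gear-style rolling hash (assumed for this sketch).
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]


def chunk_boundaries(data: bytes):
    """Yield (start, end) offsets of content-defined chunks covering data."""
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out of the low bits, so the boundary test
        # depends only on the most recent bytes of content.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield start, i + 1
            start = i + 1
            h = 0
    if start < len(data):
        yield start, len(data)  # trailing partial chunk


def deduplicate(files):
    """Store each distinct chunk once and return (index, recipes)."""
    index = {}    # chunk digest -> chunk bytes (stands in for the chunk store)
    recipes = []  # per file: ordered digests needed to rebuild that file
    for data in files:
        recipe = []
        for s, e in chunk_boundaries(data):
            chunk = data[s:e]
            digest = hashlib.sha256(chunk).digest()
            index.setdefault(digest, chunk)  # keep only the first copy
            recipe.append(digest)
        recipes.append(recipe)
    return index, recipes


def restore(index, recipe):
    """Rebuild a file from its recipe."""
    return b"".join(index[d] for d in recipe)


if __name__ == "__main__":
    rng = random.Random(1)
    base = bytes(rng.getrandbits(8) for _ in range(32768))
    edited = b"a few inserted bytes" + base  # same content, shifted offsets
    index, recipes = deduplicate([base, edited])
    stored = sum(len(c) for c in index.values())
    print(f"logical bytes: {len(base) + len(edited)}, stored bytes: {stored}")
```

Because boundaries depend on content rather than on fixed offsets, inserting bytes at the front of a file shifts every later offset yet leaves most chunks unchanged, so the second file in the demo shares nearly all of its chunks with the first. A fixed-offset chunker would lose that sharing after any insertion.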
author = {Ahmed El-Shimi and Ran Kalach and Ankit Kumar and Adi Oltean and Jin Li and Sudipta Sengupta},
title = {Primary Data {Deduplication{\textemdash}Large} Scale Study and System Design},
booktitle = {2012 USENIX Annual Technical Conference (USENIX ATC 12)},
year = {2012},
isbn = {978-931971-93-5},
address = {Boston, MA},
pages = {285--296},
url = {https://www.usenix.org/conference/atc12/technical-sessions/presentation/el-shimi},
publisher = {USENIX Association},
month = jun
}