Accordion: Multi-Scale Recipes for Adaptive Detection of Duplication
Russell Lewis and John H. Hartman, University of Arizona
A recipe is metadata that describes the contents of a file as a sequence of blocks identified by their hash. Using recipes, one can rapidly compare the contents of two files without reading the files themselves. Unfortunately, recipes present a space/precision tradeoff: small block sizes will maximize the duplication that is discoverable, but large block sizes produce small recipes that can be compared more quickly. In this paper, we present Accordion, a toolset for the creation and use of multi-scale recipes—that is, recipes that include blocks at several different scales. We demonstrate two duplication-detection algorithms—one optimized for situations where lots of duplication is expected, and another for those where the existence of duplication is uncertain.
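The space/precision tradeoff described above can be illustrated with a minimal sketch. This is not the Accordion toolset or either of the paper's detection algorithms. It assumes fixed-size blocks, SHA-256 as the block hash, and three arbitrary scales (4096, 1024, and 256 bytes). The helper names `make_recipe`, `make_multiscale_recipe`, and `count_shared_blocks` are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Dict, List, Sequence


def make_recipe(data: bytes, block_size: int) -> List[str]:
    """Describe data as a sequence of fixed-size blocks identified by their hash."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def make_multiscale_recipe(
    data: bytes, scales: Sequence[int] = (4096, 1024, 256)
) -> Dict[int, List[str]]:
    """Build one recipe per block size (the scales are an assumption of this sketch)."""
    return {size: make_recipe(data, size) for size in scales}


def count_shared_blocks(recipe_a: List[str], recipe_b: List[str]) -> int:
    """Count blocks of recipe_b whose hash also appears in recipe_a.

    Only the recipes are compared; the file contents are never read.
    """
    known = set(recipe_a)
    return sum(1 for block_hash in recipe_b if block_hash in known)


# Deterministic 16 KiB test buffer with no repeated blocks.
original = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(512))

# A copy that differs from the original in exactly one byte.
modified = bytearray(original)
modified[5000] ^= 0xFF
modified = bytes(modified)

recipes_a = make_multiscale_recipe(original)
recipes_b = make_multiscale_recipe(modified)

for size in sorted(recipes_a, reverse=True):
    shared = count_shared_blocks(recipes_a[size], recipes_b[size])
    print(size, len(recipes_b[size]), shared)
# prints:
# 4096 4 3
# 1024 16 15
# 256 64 63
```

In this example the 4096-byte recipe has only 4 entries to compare but discovers 12,288 duplicated bytes, while the 256-byte recipe has 64 entries and discovers 16,128 of the 16,384 bytes. A multi-scale recipe keeps both views available, so a comparison can start coarse and refine only where needed.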
author = {Russell Lewis and John H. Hartman},
title = {Accordion: {Multi-Scale} Recipes for Adaptive Detection of Duplication},
booktitle = {7th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 15)},
year = {2015},
address = {Santa Clara, CA},
url = {https://www.usenix.org/conference/hotstorage15/workshop-program/presentation/lewis},
publisher = {USENIX Association},
month = jul
}