Migratory Compression: Coarse-grained Data Reordering to Improve Compressibility
Xing Lin, University of Utah; Guanlin Lu, Fred Douglis, Philip Shilane, and Grant Wallace, EMC Corporation—Data Protection and Availability Division
We propose Migratory Compression (MC), a coarse-grained data transformation, to improve the effectiveness of traditional compressors in modern storage systems. In MC, similar data chunks are relocated so that they sit next to each other, which improves compression factors. After decompression, migrated chunks return to their previous locations. We evaluate the compression effectiveness and overhead of MC, explore reorganization approaches on a variety of datasets, and present a prototype implementation of MC in a commercial deduplicating file system. We also compare MC to the more established technique of delta compression, which is significantly more complex to implement within file systems.
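The reorder, compress, and restore cycle described above can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed 4 KiB chunk size, the toy similarity feature (the maximum CRC32 over sliding 8-byte windows), and the function names `split_chunks`, `feature`, `mc_compress`, and `mc_decompress` are all assumptions made for illustration. The abstract does not say how the paper detects similar chunks.

```python
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; not taken from the paper


def split_chunks(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def feature(chunk, window=8):
    """Toy similarity feature (an assumption): the maximum CRC32 over all
    sliding windows. Chunks sharing content often share this value."""
    if len(chunk) < window:
        return zlib.crc32(chunk)
    return max(zlib.crc32(chunk[i:i + window])
               for i in range(len(chunk) - window + 1))


def mc_compress(data):
    """Migrate similar chunks next to each other, then compress with zlib.

    Returns the compressed payload plus the metadata needed to undo the
    migration: the original index and length of each migrated chunk.
    """
    chunks = split_chunks(data)
    # Sorting by feature places chunks with equal features side by side.
    order = sorted(range(len(chunks)), key=lambda i: feature(chunks[i]))
    lengths = [len(chunks[i]) for i in order]
    migrated = b"".join(chunks[i] for i in order)
    return zlib.compress(migrated), order, lengths


def mc_decompress(payload, order, lengths):
    """Decompress, then return every chunk to its original position."""
    migrated = zlib.decompress(payload)
    restored = [b""] * len(order)
    pos = 0
    for original_index, n in zip(order, lengths):
        restored[original_index] = migrated[pos:pos + n]
        pos += n
    return b"".join(restored)
```

For any input `data`, `mc_decompress(*mc_compress(data))` should return the original bytes. Whether the reordering actually improves the compression factor depends on the data and on the quality of the similarity feature, so this sketch makes no claim about the gains reported in the paper.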
We find that Migratory Compression improves compression effectiveness compared to traditional compressors, by 11% to 105%, with relatively low impact on runtime performance. Frequently, adding MC to a relatively fast compressor like gzip results in compression that is more effective in both space and runtime than slower alternatives. In archival migration, MC improves gzip compression by 44–157%. Most importantly, MC can be implemented in broadly used, modern file systems.
author = {Xing Lin and Guanlin Lu and Fred Douglis and Philip Shilane and Grant Wallace},
title = {Migratory Compression: Coarse-grained Data Reordering to Improve Compressibility},
booktitle = {12th USENIX Conference on File and Storage Technologies (FAST 14)},
year = {2014},
address = {Santa Clara, CA},
pages = {256--273},
url = {https://www.usenix.org/conference/fast14/technical-sessions/presentation/lin},
publisher = {USENIX Association},
month = feb
}