sponsors
help promote
usenix conference policies
You are here
SEeSAW - Similarity Exploiting Storage for Accelerating Analytics Workflows
Kalapriya Kannan, Suparna Bhattacharya, Kumar Raj, Muthukumar Murugan, and Doug Voigt, Hewlett Packard Enterprise
The key to successful deployment of big data solutions lies in the timely distillation of meaningful information. This is made difficult by the mismatch between volume and velocity of data at scale and challenges posed by disparate speeds of IO, CPU, memory and communication links of data storage and processing systems. Instead of viewing storage as a bottleneck in this pipeline, we believe that storage systems are best positioned to discover and exploit intrinsic data properties to enhance information density of stored data. This has the potential to reduce the amount of new information that needs to be processed by an analytics workflow. Towards exploring this possibility, we propose SEeSAW, a Similarity Exploiting Storage for Accelerating Analytics Workflows that makes similarity a fundamental storage primitive. We show that SEeSAW transparently eliminates the need for applications to process uninformative data, thereby incurring substantially lower costs on IO, memory, computation and communication while speeding up (about 97% as observed in our experiment) the rate at which actionable outcomes can be derived by analyzing data. By increasing capacity of analytics workloads to absorb more data within the same resource envelope, SEeSAW can open up rich opportunities to reap greater benefits from machine and human generated data accumulated from various sources.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Kalapriya Kannan and Suparna Bhattacharya and Kumar Raj and Muthukumar Murugan and Doug Voigt},
title = {{SEeSAW} - Similarity Exploiting Storage for Accelerating Analytics Workflows},
booktitle = {8th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 16)},
year = {2016},
address = {Denver, CO},
url = {https://www.usenix.org/conference/hotstorage16/workshop-program/presentation/kannan},
publisher = {USENIX Association},
month = jun
}
connect with us