Provenance-aware Versioned Dataworkspaces
Xing Niu and Bahareh Sadat Arab, Illinois Institute of Technology; Dieter Gawlick, Zhen Hua Liu, and Vasudha Krishnaswamy, Oracle Corporation; Oliver Kennedy, University at Buffalo; Boris Glavic, Illinois Institute of Technology
Data preparation, curation, and analysis tasks are often exploratory in nature, with analysts incrementally designing workflows that transform, validate, and visualize their input sources. This requires frequent adjustments to data and workflows. Unfortunately, in current data management systems, even small changes can require time- and resource-heavy operations like materialization, manual version management, and re-execution. This added overhead discourages exploration. We present Provenance-aware Versioned Dataworkspaces (PVDs), our vision of a sandboxed environment in which users can apply—and more importantly, easily undo—changes to their data and workflows. A PVD keeps a log of the user’s operations in a light-weight version graph structure. We describe a model for PVDs that admits efficient automatic refresh, merging of histories, reenactment, and automated conflict resolution. We also highlight the conceptual and technical challenges that need to be overcome to create a practical PVD.
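The idea of logging operations in a version graph, then undoing or reenacting them, can be illustrated with a minimal sketch. This is not the authors' model: the names `Workspace`, `Version`, `apply`, `undo`, and `reenact` are hypothetical, tables are assumed to be plain lists of dictionaries, and automatic refresh, merging of histories, and conflict resolution are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Table = List[Row]
Operation = Callable[[Table], Table]


@dataclass(frozen=True)
class Version:
    """One node of the version graph: an operation applied on top of a parent."""
    vid: int
    parent: Optional[int]  # None means the parent is the untouched base data
    label: str
    op: Operation


class Workspace:
    """Hypothetical sandbox that logs operations instead of materializing results."""

    def __init__(self, base: Table) -> None:
        self.base = base
        self.versions: Dict[int, Version] = {}
        self.head: Optional[int] = None  # None means "no operation applied yet"

    def apply(self, label: str, op: Operation) -> int:
        """Record an operation as a new version whose parent is the current head."""
        vid = len(self.versions) + 1
        self.versions[vid] = Version(vid, self.head, label, op)
        self.head = vid
        return vid

    def undo(self) -> None:
        """Move the head back to its parent; the undone version stays in the graph."""
        if self.head is not None:
            self.head = self.versions[self.head].parent

    def history(self, vid: Optional[int] = None) -> List[Version]:
        """Return the chain of versions from the base up to vid (default: head)."""
        chain: List[Version] = []
        cur = self.head if vid is None else vid
        while cur is not None:
            version = self.versions[cur]
            chain.append(version)
            cur = version.parent
        return list(reversed(chain))

    def reenact(self, vid: Optional[int] = None) -> Table:
        """Recompute a version's data by replaying its logged operations on the base."""
        table = [dict(row) for row in self.base]  # copy so the base is never mutated
        for version in self.history(vid):
            table = version.op(table)
        return table


# Usage: two edits, one undo, then a new branch from the older version.
ws = Workspace([
    {"city": "Buffalo", "temp_f": 50.0},
    {"city": "Chicago", "temp_f": None},
])
v1 = ws.apply("drop missing temps",
              lambda t: [r for r in t if r["temp_f"] is not None])
v2 = ws.apply("add celsius",
              lambda t: [{**r, "temp_c": (r["temp_f"] - 32) * 5 / 9} for r in t])
with_celsius = ws.reenact()

ws.undo()                      # head returns to v1; v2 is still in the graph
after_undo = ws.reenact()

v3 = ws.apply("uppercase city",
              lambda t: [{**r, "city": str(r["city"]).upper()} for r in t])
branch = ws.reenact()
old_branch = ws.reenact(v2)    # the undone version can still be reenacted
```

Because each version stores only an operation and a parent pointer, undo in this sketch is a pointer move, and any earlier state can be recomputed on demand rather than stored. A practical system would need to avoid replaying the full history on every read, which this sketch does not attempt.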
title = {Provenance-aware Versioned Dataworkspaces},
booktitle = {8th USENIX Workshop on the Theory and Practice of Provenance (TaPP 16)},
year = {2016},
address = {Washington, D.C.},
url = {https://www.usenix.org/conference/tapp16/workshop-program/presentation/niu},
publisher = {USENIX Association},
month = jun
}