Supporting Undo and Redo in Scientific Data Analysis
Xiang Zhao, University of Massachusetts, Amherst; Emery R. Boose, Harvard University; Yuriy Brun, University of Massachusetts, Amherst; Barbara Staudt Lerner, Mount Holyoke College; Leon J. Osterweil, University of Massachusetts, Amherst
This paper presents a provenance-based technique to support undoing and redoing data analysis tasks. Our technique targets scientists who experiment with combinations of approaches to processing raw data into presentable datasets. Raw data may be noisy and in need of cleaning, it may suffer from sensor drift that requires retrospective calibration and data correction, or it may need gap-filling due to sensor malfunction or environmental conditions. Different raw datasets may have different issues requiring different kinds of adjustments, and each issue may potentially be handled by different approaches. Thus, scientists must often experiment with different sequences of approaches. In our work, we show how provenance information can be used to facilitate this kind of experimentation with scientific datasets. We describe an approach that supports the ability to (1) undo a set of tasks while setting aside the artifacts and consequences of performing those tasks, (2) replace, remove, or add a data-processing technique, and (3) automatically redo those set-aside tasks that are consistent with the changed technique. We have implemented our technique and demonstrate its utility with a case study of a common sensor-network data-processing scenario, showing how our approach can reduce the cost of changing intermediate data-processing techniques in a complex, data-intensive process.
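The three capabilities listed in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Pipeline`, `Step`, and `Record` classes, the step functions, and the sample data are all hypothetical, and the "consistency" check is reduced to the simplest possible rule (every set-aside step is redone except the one that was replaced). A real provenance-based system would presumably decide consistency from recorded data dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    func: Callable[[list], list]


@dataclass
class Record:
    """One provenance entry: the step that ran and the artifact it produced."""
    step: Step
    output: list


class Pipeline:
    """Hypothetical linear pipeline that records provenance for undo/redo."""

    def __init__(self, raw: list):
        self.raw = list(raw)
        self.history: List[Record] = []    # executed steps and their artifacts
        self.set_aside: List[Record] = []  # undone steps kept for a later redo

    def current(self) -> list:
        return self.history[-1].output if self.history else self.raw

    def run(self, step: Step) -> list:
        out = step.func(self.current())
        self.history.append(Record(step, out))
        return out

    def undo_from(self, name: str) -> None:
        """(1) Undo the named step and everything after it, setting it all aside."""
        idx = next(i for i, r in enumerate(self.history) if r.step.name == name)
        self.set_aside = self.history[idx:]
        self.history = self.history[:idx]

    def redo(self, replaced: Optional[str] = None) -> list:
        """(3) Re-run set-aside steps, skipping the one that was replaced."""
        for record in self.set_aside:
            if record.step.name != replaced:
                self.run(record.step)
        self.set_aside = []
        return self.current()


# Hypothetical data-processing techniques for a sensor series with one gap.
def fill_zero(xs):
    return [0.0 if x is None else x for x in xs]


def fill_prev(xs):
    out, last = [], 0.0
    for x in xs:
        last = last if x is None else x
        out.append(last)
    return out


def drop_outliers(xs):
    return [x for x in xs if x <= 50.0]


def calibrate(xs):
    return [x * 2.0 for x in xs]


pipeline = Pipeline([1.0, None, 3.0, 100.0, 5.0])
pipeline.run(Step("gap_fill", fill_zero))
pipeline.run(Step("clean", drop_outliers))
first_result = pipeline.run(Step("calibrate", calibrate))

pipeline.undo_from("gap_fill")                 # (1) undo, keep the artifacts aside
pipeline.run(Step("gap_fill", fill_prev))      # (2) replace the technique
second_result = pipeline.redo(replaced="gap_fill")  # (3) redo the rest

print(first_result)   # [2.0, 0.0, 6.0, 10.0]
print(second_result)  # [2.0, 2.0, 6.0, 10.0]
```

In this sketch only the gap-filling technique changes, and the downstream cleaning and calibration steps are replayed from the recorded history rather than being re-specified by hand.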
author = {Xiang Zhao and Emery R. Boose and Yuriy Brun and Barbara Staudt Lerner and Leon J. Osterweil},
title = {Supporting Undo and Redo in {Scientific} Data Analysis},
booktitle = {5th USENIX Workshop on the Theory and Practice of Provenance (TaPP 13)},
year = {2013},
address = {Lombard, IL},
url = {https://www.usenix.org/conference/tapp13/technical-sessions/presentation/zhao},
publisher = {USENIX Association},
month = apr
}