Data Provenance for Attributes: Attribute Lineage

Authors: 

Dennis Dosso, University of Padua; Susan B. Davidson, University of Pennsylvania; Gianmaria Silvello, University of Padua

Abstract: 

In this paper we define a new kind of data provenance for database management systems, called attribute lineage for SPJRU queries, building on previous works on data provenance for tuples.

We take inspiration from the classical lineage, metadata that enables users to discover which tuples in the input are used to produce a tuple in the output. Attribute lineage is instead defined as the set of all cells in the input database that are used by the query to produce one cell in the output.

It is shown that attribute lineage is more informative than simple lineage and we discuss potential new applications for this new metadata.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {255016,
author = {Dennis Dosso and Susan B. Davidson and Gianmaria Silvello},
title = {Data Provenance for Attributes: Attribute Lineage},
booktitle = {12th International Workshop on Theory and Practice of Provenance (TaPP 2020)},
year = {2020},
url = {https://www.usenix.org/conference/tapp2020/presentation/dosso},
publisher = {USENIX Association},
month = jun
}