"What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation

Authors: 

Meenatchi Sundaram Muthu Selva Annamalai, University College London; Georgi Ganev, University College London and Hazy; Emiliano De Cristofaro, University of California, Riverside

Abstract: 

Differentially private synthetic data generation (DP-SDG) algorithms are used to release datasets that are structurally and statistically similar to sensitive data while providing formal bounds on the information they leak. However, bugs in algorithms and implementations may cause the actual information leakage to be higher. This prompts the need to verify whether the theoretical guarantees of state-of-the-art DP-SDG implementations also hold in practice. We do so via a rigorous auditing process: we compute the information leakage via an adversary playing a distinguishing game and running membership inference attacks (MIAs). If the leakage observed empirically is higher than the theoretical bounds, we identify a DP violation; if it is non-negligibly lower, the audit is loose.
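To make the comparison between empirical and theoretical leakage concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard hypothesis-testing view of (ε, δ)-DP, under which any membership inference attack's false positive rate (FPR) and false negative rate (FNR) must satisfy FPR + e^ε · FNR ≥ 1 − δ and FNR + e^ε · FPR ≥ 1 − δ. The function names `empirical_epsilon` and `audit_verdict` and the tolerance `tol` are hypothetical, introduced only for illustration. The sketch uses raw error rates as point estimates; a statistically sound audit would typically replace them with upper confidence bounds (for example, Clopper-Pearson intervals) computed over many runs of the distinguishing game.

```python
import math


def empirical_epsilon(fpr: float, fnr: float, delta: float = 0.0) -> float:
    """Point-estimate lower bound on epsilon implied by an attack's error rates.

    Derived from the (epsilon, delta)-DP constraints
        fpr + exp(eps) * fnr >= 1 - delta
        fnr + exp(eps) * fpr >= 1 - delta
    """
    candidates = []
    for a, b in ((fpr, fnr), (fnr, fpr)):
        numerator = 1.0 - delta - a
        if numerator <= 0:
            continue  # this constraint yields no information about epsilon
        if b == 0:
            return math.inf  # no finite epsilon satisfies the constraint
        candidates.append(math.log(numerator / b))
    # Epsilon is non-negative, so clamp uninformative (negative) estimates to 0.
    return max(0.0, max(candidates, default=0.0))


def audit_verdict(eps_empirical: float, eps_theory: float, tol: float = 0.1) -> str:
    """Classify an audit outcome; `tol` is an assumed threshold for 'non-negligible'."""
    if eps_empirical > eps_theory:
        return "violation"  # observed leakage exceeds the claimed guarantee
    if eps_theory - eps_empirical > tol:
        return "loose"      # the attack is too weak to reach the bound
    return "tight"


if __name__ == "__main__":
    eps = empirical_epsilon(fpr=0.1, fnr=0.1, delta=1e-5)
    print(f"empirical epsilon: {eps:.3f}")
    print("verdict against a claimed epsilon of 1.0:", audit_verdict(eps, 1.0))
```

An attack that does no better than random guessing (FPR = FNR = 0.5) yields an empirical ε of 0, which is why weak black-box attacks tend to produce loose audits even when the implementation is correct.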

We audit six DP-SDG implementations using different datasets and threat models and find that black-box MIAs commonly used against DP-SDGs are severely limited in power, yielding remarkably loose empirical privacy estimates. We then consider MIAs in stronger threat models, i.e., passive and active white-box, using both existing and newly proposed attacks. Overall, we find that tightly estimating the privacy leakage from DP-SDGs currently requires not only white-box MIAs but also worst-case datasets. Finally, we show that our automated auditing procedure finds known DP violations (in 4 out of the 6 implementations) as well as a new one in the DPWGAN implementation that was successfully submitted to the NIST DP Synthetic Data Challenge.

The source code needed to reproduce our experiments is available from https://github.com/spalabucr/synth-audit.

BibTeX
@inproceedings {299750,
author = {Meenatchi Sundaram Muthu Selva Annamalai and Georgi Ganev and Emiliano De Cristofaro},
title = {"What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation},
booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
year = {2024},
isbn = {978-1-939133-44-1},
address = {Philadelphia, PA},
pages = {4855--4871},
url = {https://www.usenix.org/conference/usenixsecurity24/presentation/annamalai-theory},
publisher = {USENIX Association},
month = aug
}