Privacy Side Channels in Machine Learning Systems

Authors: 

Edoardo Debenedetti, ETH Zurich; Giorgio Severi, Northeastern University; Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, and Milad Nasr, Google DeepMind; Eric Wallace, UC Berkeley; Florian Tramèr, ETH Zurich

Abstract: 

Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum. Yet, in reality, these models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is otherwise possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and allow for enhanced membership inference, data extraction, and even novel threats such as extraction of users' test queries. For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees. We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set—even if the model did not memorize these keys. Taken together, our results demonstrate the need for a holistic, end-to-end privacy analysis of machine learning systems.
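The deduplication side channel mentioned above can be illustrated with a minimal sketch (names and data are hypothetical, and this assumes a dedup variant that drops every record appearing more than once): an attacker who injects a copy of a target's record makes the post-dedup training set depend on whether the target's record was present, so the two "neighboring datasets" that differential privacy reasons about can differ by more than one record.

```python
from collections import Counter

def deduplicate(records):
    """Illustrative dedup: keep only records that appear exactly once.
    (One of several possible dedup policies; chosen for clarity.)"""
    counts = Counter(records)
    return [r for r in records if counts[r] == 1]

# Hypothetical target record whose membership the attacker wants to infer.
target = "alice@example.com"
attacker_copy = [target]  # attacker injects one copy into the training data

# Case 1: the target's record IS in the training data.
with_target = ["bob", "carol", target] + attacker_copy
# Case 2: the target's record is NOT in the training data.
without_target = ["bob", "carol"] + attacker_copy

# After dedup, the two datasets differ by more than the single target
# record: the injected copy survives only when the target was absent.
print(deduplicate(with_target))     # ['bob', 'carol']
print(deduplicate(without_target))  # ['bob', 'carol', 'alice@example.com']
```

Because the surviving dataset now encodes the target's membership, subsequent differentially-private training no longer bounds what the attacker can learn, which is the sense in which the provable guarantee is invalidated.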


BibTeX
@inproceedings {299864,
author = {Edoardo Debenedetti and Giorgio Severi and Nicholas Carlini and Christopher A. Choquette-Choo and Matthew Jagielski and Milad Nasr and Eric Wallace and Florian Tram{\`e}r},
title = {Privacy Side Channels in Machine Learning Systems},
booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
year = {2024},
isbn = {978-1-939133-44-1},
address = {Philadelphia, PA},
pages = {6848--6861},
url = {https://www.usenix.org/conference/usenixsecurity24/presentation/debenedetti},
publisher = {USENIX Association},
month = aug
}
