Benjamin Andow and Samin Yaseer Mahmud, North Carolina State University; Wenyu Wang, University of Illinois at Urbana-Champaign; Justin Whitaker, William Enck, and Bradley Reaves, North Carolina State University; Kapil Singh, IBM T.J. Watson Research Center; Tao Xie, University of Illinois at Urbana-Champaign
Privacy policies are the primary mechanism by which companies inform users about data collection and sharing practices. To help users better understand these long and complex legal documents, recent research has proposed tools that summarize collection and sharing. However, these tools have a significant oversight: they do not account for contradictions that may occur within an individual policy. In this paper, we present PolicyLint, a privacy policy analysis tool that identifies such contradictions by simultaneously considering negation and varying semantic levels of data objects and entities. To do so, PolicyLint automatically generates ontologies from a large corpus of privacy policies and uses sentence-level natural language processing to capture both positive and negative statements of data collection and sharing. We use PolicyLint to analyze the policies of 11,430 apps and find that 14.2% of these policies contain contradictions that may be indicative of misleading statements. We manually verify 510 contradictions, identifying concerning trends that include the use of misleading presentation, attempted redefinition of common understandings of terms, conflicts in regulatory definitions (e.g., US and EU), and "laundering" of tracking information facilitated by sharing or collecting data that can be used to derive sensitive information. In doing so, PolicyLint significantly advances automated analysis of privacy policies.
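To make the idea of an internal contradiction concrete, here is a minimal sketch, not PolicyLint's actual implementation. It assumes a tiny hand-written ontology (PolicyLint generates its ontologies from a policy corpus) and statements that have already been extracted as (entity, collects?, data object) tuples. The names `DATA_PARENTS`, `Statement`, `subsumes`, and `find_conflicts` are hypothetical. For simplicity it matches entities by exact string and flags only the case where a negative statement covers a data type that is the same as, or broader than, one in a positive statement.

```python
from dataclasses import dataclass

# Hypothetical toy ontology (child term -> parent term); an assumption for
# illustration, not taken from the paper.
DATA_PARENTS = {
    "email address": "personal information",
    "phone number": "personal information",
    "personal information": "information",
}


@dataclass(frozen=True)
class Statement:
    entity: str     # who collects or receives the data
    collects: bool  # True for "we collect X", False for "we do not collect X"
    data: str       # the data object mentioned in the sentence


def subsumes(general, specific, parents):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = parents.get(node)
    return False


def find_conflicts(statements, parents=DATA_PARENTS):
    """Pair each negative statement with the positive statements it conflicts with."""
    conflicts = []
    for neg in statements:
        if neg.collects:
            continue
        for pos in statements:
            if not pos.collects or pos.entity != neg.entity:
                continue
            # A denial of a broad category conflicts with an affirmation of
            # the same term or of any term beneath it in the ontology.
            if subsumes(neg.data, pos.data, parents):
                conflicts.append((neg, pos))
    return conflicts


statements = [
    Statement("we", False, "personal information"),
    Statement("we", True, "email address"),
    Statement("we", True, "location"),
]
conflicts = find_conflicts(statements)
```

In this toy input, "we do not collect personal information" is paired with "we collect email address" because the ontology places email address under personal information. The "location" statement is not flagged because the toy ontology does not relate it to personal information.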
@inproceedings{andow2019policylint,
author = {Benjamin Andow and Samin Yaseer Mahmud and Wenyu Wang and Justin Whitaker and William Enck and Bradley Reaves and Kapil Singh and Tao Xie},
title = {{PolicyLint}: Investigating Internal Privacy Policy Contradictions on Google Play},
booktitle = {28th USENIX Security Symposium (USENIX Security 19)},
year = {2019},
isbn = {978-1-939133-06-9},
address = {Santa Clara, CA},
pages = {585--602},
url = {https://www.usenix.org/conference/usenixsecurity19/presentation/andow},
publisher = {USENIX Association},
month = aug
}