openPDS: protecting the privacy of metadata through SafeAnswers

PLoS One. 2014 Jul 9;9(7):e98790. doi: 10.1371/journal.pone.0098790. eCollection 2014.

Abstract

The rise of smartphones and web services has made possible the large-scale collection of personal metadata. Information about individuals' locations, phone call logs, or web searches is collected and used intensively by organizations and big data researchers. Metadata has, however, yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give third parties fine-grained access to their metadata, and that has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. Instead of trying to anonymize individuals' metadata, it allows services to ask questions whose answers are calculated against the metadata. The dimensionality of the data shared with the services is thereby reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be shared directly, either individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research.
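The question-and-answer idea described in the abstract can be illustrated with a minimal Python sketch. It is not the openPDS API: the class `PersonalDataStore`, the record type `LocationRecord`, the helper `mostly_in_box`, and the sample coordinates are all hypothetical names and values chosen for illustration. The sketch assumes only what the abstract states, namely that raw metadata stays with the individual and that a service receives a low-dimensional answer computed against it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class LocationRecord:
    """One raw metadata point (hypothetical schema); never leaves the store."""
    timestamp: int  # Unix time in seconds
    lat: float
    lon: float


# A question maps the full record list to a single low-dimensional answer.
Question = Callable[[List[LocationRecord]], bool]


class PersonalDataStore:
    """Hypothetical stand-in for a personal data store, not the openPDS API."""

    def __init__(self, records: Iterable[LocationRecord]) -> None:
        self._records: List[LocationRecord] = list(records)
        self._questions: Dict[str, Question] = {}

    def register_question(self, name: str, question: Question) -> None:
        """Install a question that the owner has approved."""
        self._questions[name] = question

    def answer(self, name: str) -> bool:
        """Run an approved question; only its answer is returned."""
        if name not in self._questions:
            raise KeyError(f"question not approved: {name}")
        return self._questions[name](self._records)


def mostly_in_box(south: float, north: float, west: float, east: float) -> Question:
    """Build a question: were at least half of the points inside the box?"""

    def question(records: List[LocationRecord]) -> bool:
        if not records:
            return False
        inside = sum(
            1 for r in records
            if south <= r.lat <= north and west <= r.lon <= east
        )
        return inside / len(records) >= 0.5

    return question


# Illustrative data: three points inside the box, one outside.
pds = PersonalDataStore([
    LocationRecord(0, 42.36, -71.09),
    LocationRecord(1, 42.37, -71.10),
    LocationRecord(2, 40.71, -74.00),
    LocationRecord(3, 42.35, -71.08),
])
pds.register_question("mostly_in_area", mostly_in_box(42.2, 42.5, -71.3, -70.9))
print(pds.answer("mostly_in_area"))  # True: 3 of 4 points fall inside the box
```

The service in this sketch sees one boolean rather than the list of timestamped coordinates, which is the dimensionality reduction the abstract refers to. Whether a given answer is actually safe to release still depends on the question itself, which is why the sketch only runs questions that were registered beforehand.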

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Confidentiality
  • Data Collection
  • Humans
  • Information Storage and Retrieval*
  • Privacy*
  • Software*

Grants and funding

This research was partially sponsored by the Army Research Laboratory under Cooperative Agreement Number W911NF-09-2-0053, by the Center for Complex Engineering Systems, and by the Media Lab Consortium. The conclusions in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsors. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.