Privacy-preserving biomedical data dissemination via a hybrid approach

AMIA Annu Symp Proc. 2018 Dec 5;2018:1176-1185. eCollection 2018.

Abstract

Sharing medical data can benefit many aspects of biomedical research. However, medical data usually contain sensitive patient information and cannot be shared directly. Summary statistics, such as histograms, are widely used in medical research and serve as a sanitized synopsis of a raw health dataset such as Electronic Health Records (EHR). Such a synopsized representation is then used to support advanced operations over the health dataset, such as counting queries and learning-based tasks. Privacy, however, is becoming an increasingly important issue when generating and publishing histograms based on health data. Previous solutions show promise for securely generating histograms via differential privacy, but such methods consider only a centralized setting, and their accuracy remains a limitation for real-world applications. In this paper, we propose a novel hybrid solution that combines two rigorous theoretical models (homomorphic encryption and differential privacy) to securely generate synthetic V-optimal histograms over distributed datasets. Our results demonstrate improved accuracy over a previous study on real medical datasets.
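For readers unfamiliar with differentially private histograms, the following is a minimal sketch of the standard centralized Laplace mechanism, which is the kind of baseline the abstract contrasts against. It is not the paper's hybrid method: it uses fixed equal-width bins rather than V-optimal ones, involves no homomorphic encryption, and assumes a single trusted data holder. The function names, the sample ages, the bin edges, and the choice of epsilon are all hypothetical and chosen only for illustration.

```python
import random


def exact_counts(values, bin_edges):
    """Count how many values fall in each half-open bin [edge_i, edge_i+1)."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(values, bin_edges, epsilon, rng=None):
    """Return bin counts perturbed with Laplace noise of scale 1/epsilon.

    Assumption: neighbouring datasets differ by adding or removing one
    record, so each record affects exactly one bin count by 1 and the
    histogram has L1 sensitivity 1.
    """
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in exact_counts(values, bin_edges)]


# Hypothetical example data: patient ages and three age bins.
ages = [23, 35, 41, 47, 52, 58, 63, 67, 71, 79]
edges = [20, 40, 60, 80]
noisy = noisy_histogram(ages, edges, epsilon=1.0, rng=random.Random(0))
print(exact_counts(ages, edges), noisy)
```

A smaller epsilon gives stronger privacy but noisier counts, which illustrates the accuracy limitation that the abstract attributes to earlier approaches.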

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computer Security*
  • Confidentiality*
  • Electronic Health Records*
  • Humans
  • Information Dissemination
  • Medical Records Systems, Computerized
  • Models, Theoretical
  • Privacy