An Analysis of Erlangen University Hospital's Billing Data on Utility-Based De-Identification

Stud Health Technol Inform. 2019:258:70-74.

Abstract

Background: To make patient care data more accessible for research, German university hospitals have joined forces within the Medical Informatics Initiative. As a first step, the administrative data of the university hospitals is made available for federated use. Project-specific de-identification of this data is necessary to comply with privacy laws.

Objective: We aim to quantify the population uniqueness of the data. By generalizing the data, we try to reduce uniqueness and improve k-anonymity.

Methods: We analyze the quasi-identifying attributes of the Erlangen University Hospital's billing data with regard to population uniqueness and re-identification risk. To measure uniqueness, we count the individuals in each equivalence class (k).
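
The counting step described in the methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the attribute names (sex, age, diagnosis) and the toy records are assumptions made for illustration, and uniqueness is measured within the data set itself.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Map each combination of quasi-identifier values to the number of records sharing it."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records, quasi_identifiers):
    """Return the smallest equivalence class size, i.e. the k the data set satisfies."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def uniqueness(records, quasi_identifiers):
    """Return the fraction of records that are alone in their equivalence class (k = 1)."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    unique = sum(1 for n in sizes.values() if n == 1)
    return unique / len(records) if records else 0.0


# Toy records, invented for illustration only (not taken from the study).
records = [
    {"sex": "F", "age": 34, "diagnosis": "J18"},
    {"sex": "F", "age": 34, "diagnosis": "J18"},
    {"sex": "M", "age": 71, "diagnosis": "I21"},
    {"sex": "M", "age": 58, "diagnosis": "I21"},
]
QI = ["sex", "age", "diagnosis"]

print(k_anonymity(records, QI))  # smallest class size
print(uniqueness(records, QI))   # share of unique records
```

In this toy example the two male patients are each alone in their class, so the data set is only 1-anonymous even though the two female records share a class.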

Results: Because diagnoses and procedures are particularly unique in combination with the sex and age of the patients, the data set does not satisfy k-anonymity with k > 1. We are able to reduce population uniqueness through generalization and suppression of unique domains.
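
How generalization and suppression can reduce uniqueness is sketched below. The abstract does not state which generalization hierarchies or suppression rule were used, so the 10-year age bands, the truncation of diagnosis codes to their first character, and the record-level suppression of classes smaller than k are all assumptions, as are the function names and the toy records.

```python
from collections import Counter


def generalize(record):
    """Coarsen a record: 10-year age band and first character of the diagnosis code."""
    low = (record["age"] // 10) * 10
    return {
        "sex": record["sex"],
        "age": f"{low}-{low + 9}",
        "diagnosis": record["diagnosis"][:1],
    }


def suppress_small_classes(records, quasi_identifiers, k):
    """Drop records whose equivalence class has fewer than k members."""
    def key(r):
        return tuple(r[q] for q in quasi_identifiers)

    sizes = Counter(key(r) for r in records)
    return [r for r in records if sizes[key(r)] >= k]


# Toy records, invented for illustration only (not taken from the study).
records = [
    {"sex": "F", "age": 34, "diagnosis": "J18"},
    {"sex": "F", "age": 37, "diagnosis": "J45"},
    {"sex": "M", "age": 71, "diagnosis": "I21"},
    {"sex": "M", "age": 58, "diagnosis": "I21"},
]
QI = ["sex", "age", "diagnosis"]

generalized = [generalize(r) for r in records]
released = suppress_small_classes(generalized, QI, k=2)
print(released)
```

Before generalization every toy record is unique. Afterwards the two female records fall into one class, while the two male records remain unique because their age bands differ and are therefore suppressed. The trade-off is visible even here: each coarser band or dropped record costs some utility.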

Conclusion: To achieve k-anonymity with k > 1 while still maintaining a given level of data utility, we need to apply further established de-identification strategies.

Keywords: Data privacy; k-anonymity; population uniqueness; risk; secondary use.

MeSH terms

  • Data Anonymization*
  • Fees and Charges
  • Hospitals, University*
  • Humans
  • Maintenance
  • Medical Informatics*
  • Privacy