A Review of Anonymization for Healthcare Data

Big Data. 2022 Mar 10. doi: 10.1089/big.2021.0169. Online ahead of print.

Abstract

Mining health data can lead to faster medical decisions, improvement in the quality of treatment, disease prevention, and reduced cost, and it drives innovative solutions within the healthcare sector. However, health data are highly sensitive and subject to regulations such as the General Data Protection Regulation, which aims to ensure patient's privacy. Anonymization or removal of patient identifiable information, although the most conventional way, is the first important step to adhere to the regulations and incorporate privacy concerns. In this article, we review the existing anonymization techniques and their applicability to various types (relational and graph based) of health data. Besides, we provide an overview of possible attacks on anonymized data. We illustrate via a reconstruction attack that anonymization, although necessary, is not sufficient to address patient privacy and discuss methods for protecting against such attacks. Finally, we discuss tools that can be used to achieve anonymization.

Keywords: anonymization; attacks; healthcare data; privacy.