Kung Faux Pandas: Simplifying privacy protection

AMIA Jt Summits Transl Sci Proc. 2019 May 6:2019:267-274. eCollection 2019.

Abstract

There are many barriers to data access and data sharing, especially in computational research using health care data. Legal constraints, such as HIPAA, protect patient privacy but slow access to data and limit reproducibility. We describe an end-to-end system called Kung Faux Pandas for easily generating de-identified or synthetic data that is statistically similar to real data but lacks sensitive information. The system narrows data synthesis and de-identification to a specific research question, allowing self-service data access without the complexity of generating an entire population of data that a given research project does not need. Kung Faux Pandas is an open-source, publicly available system that lowers barriers to HIPAA- and GDPR-compliant data sharing to support reproducibility and other purposes.
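The idea of synthesizing only the columns a research question needs can be illustrated with a minimal sketch. This is not the Kung Faux Pandas API: the function `synthesize`, its parameters, and the toy records are all hypothetical, and the method shown (independent sampling from simple per-column marginals) is a stand-in chosen for brevity. It preserves each column's distribution only approximately and discards correlations between columns.

```python
import random
import statistics

def synthesize(records, columns, n, seed=0):
    """Hypothetical helper: draw n synthetic rows, one column at a time."""
    rng = random.Random(seed)
    synthetic = [{} for _ in range(n)]
    for col in columns:
        values = [r[col] for r in records]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            # Numeric column: sample from a normal fitted to the observed mean and spread.
            mu = statistics.mean(values)
            sigma = statistics.stdev(values) if len(values) > 1 else 0.0
            draws = [rng.gauss(mu, sigma) for _ in range(n)]
        else:
            # Categorical column: resample observed labels with replacement.
            # Only suitable for low-cardinality fields, never for identifiers.
            draws = rng.choices(values, k=n)
        for row, value in zip(synthetic, draws):
            row[col] = value
    return synthetic

# Toy input records (invented for illustration only).
real = [
    {"name": "A. Jones", "age": 34, "sex": "F", "systolic_bp": 118},
    {"name": "B. Smith", "age": 51, "sex": "M", "systolic_bp": 131},
    {"name": "C. Lee", "age": 47, "sex": "F", "systolic_bp": 125},
    {"name": "D. Patel", "age": 62, "sex": "M", "systolic_bp": 140},
]

# Only the columns relevant to the (hypothetical) research question are
# synthesized; the identifying "name" column is never copied.
fake = synthesize(real, columns=["age", "sex", "systolic_bp"], n=100)
```

Restricting `columns` to what the analysis needs is what keeps the sketch small: no model of the full patient population is required, only of the fields under study.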