Domain Knowledge-Driven Generation of Synthetic Healthcare Data

Stud Health Technol Inform. 2023 May 18:302:352-353. doi: 10.3233/SHTI230136.

Abstract

Longitudinal healthcare data collected across patients' life cycles now offer many opportunities for healthcare transformation using artificial intelligence algorithms. However, access to "real" healthcare data is a major challenge for ethical and legal reasons. Electronic health records (EHRs) also pose challenges of their own, including bias, heterogeneity, imbalanced data, and small sample sizes. In this study, we introduce a domain knowledge-driven framework for generating synthetic EHRs, as an alternative to methods that rely solely on EHR data or on expert knowledge. By leveraging external medical knowledge sources in the training algorithm, the proposed framework is designed to maintain data utility, fidelity, and clinical validity while preserving patient privacy.

Keywords: Domain Knowledge; EHR; Representation Learning; Synthetic Data.

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Confidentiality
  • Electronic Health Records*
  • Humans