An Approach for Generating Realistic Australian Synthetic Healthcare Data

Stud Health Technol Inform. 2024 Jan 25:310:820-824. doi: 10.3233/SHTI231079.

Abstract

Healthcare data is a scarce resource, and access to it is often cumbersome. Medical software development would benefit from real datasets, but patient privacy takes priority. Realistic synthetic healthcare data can fill this gap by providing datasets for quality control while preserving patient anonymity and privacy. Existing methods focus on American or European patient data; none focuses exclusively on the Australian population, even though Australia is a highly diverse country with a unique healthcare system. To address this gap, we used Synthea, a popular publicly available tool, to generate disease progressions based on the Australian population. With this approach, we generated 100,000 synthetic patients following Queensland (Australia) demographics.
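
As a rough illustration of what "following Queensland demographics" could mean at the sampling step, the sketch below draws synthetic patients from weighted demographic distributions. It is not Synthea's implementation or the authors' configuration: the function name `sample_population` and the age-band and sex shares are hypothetical placeholders rather than real Queensland statistics, and the disease-progression modelling that Synthea layers on top is omitted.

```python
import random
from collections import Counter

# Hypothetical placeholder shares for illustration only.
# These are NOT real Queensland statistics.
AGE_BAND_WEIGHTS = {"0-17": 0.25, "18-64": 0.60, "65+": 0.15}
SEX_WEIGHTS = {"F": 0.5, "M": 0.5}


def sample_population(n, seed=0):
    """Draw n synthetic patients whose attributes follow the weights above."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    age_bands = rng.choices(
        list(AGE_BAND_WEIGHTS), weights=list(AGE_BAND_WEIGHTS.values()), k=n
    )
    sexes = rng.choices(
        list(SEX_WEIGHTS), weights=list(SEX_WEIGHTS.values()), k=n
    )
    return [
        {"id": i, "age_band": age_band, "sex": sex}
        for i, (age_band, sex) in enumerate(zip(age_bands, sexes))
    ]


patients = sample_population(100_000)
# Summarise how many patients fell into each age band.
print(Counter(p["age_band"] for p in patients))
```

With a population of this size, the sampled shares should sit close to the configured weights, which is the property a demographic-driven generator relies on.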

Keywords: Australia; Synthea; Synthetic healthcare data; data simulation.

MeSH terms

  • Australia
  • Disease Progression
  • Health Facilities*
  • Humans
  • Privacy*
  • Queensland