Opportunities and Challenges of Synthetic Data Generation in Oncology

JCO Clin Cancer Inform. 2023 Aug:7:e2300045. doi: 10.1200/CCI.23.00045.

Abstract

Widespread interest in artificial intelligence (AI) in health care has focused mainly on deductive systems that analyze available real-world data to discover patterns not otherwise visible. Generative adversarial network, a new type of inductive AI, has recently evolved to generate high-fidelity virtual synthetic data (SD) trained on relatively limited real-world information. The AI system is fed with a collection of real data, and it learns to generate new augmented data while maintaining the general characteristics of the original data set. The use of SD to enhance clinical research and protect patient privacy has drawn a lot of interest in medicine and in the complex field of oncology. This article summarizes the main characteristics of this innovative technology and critically discusses how it can be used to accelerate data access for secondary purposes, providing an overview of the opportunities and challenges of SD generation for clinical cancer research and health care.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Humans
  • Medical Oncology*