Evaluating diagnostic content of AI-generated chest radiography: A multi-center visual Turing test

PLoS One. 2023 Apr 12;18(4):e0279349. doi: 10.1371/journal.pone.0279349. eCollection 2023.

Abstract

Background: Accurate interpretation of chest radiographs requires years of medical training, and many countries face a shortage of medical professionals who meet this requirement. Recent advancements in artificial intelligence (AI) have aided diagnosis; however, performance is often limited by data imbalance. The aim of this study was to augment imbalanced medical data using generative adversarial networks (GANs) and evaluate the clinical quality of the generated images via a multi-center visual Turing test.

Methods: Using six chest radiograph datasets (MIMIC, CheXpert, CXR8, JSRT, VBD, and OpenI), StarGAN v2 generated chest radiographs with specific pathologies. Five board-certified radiologists from three university hospitals, each with at least five years of clinical experience, evaluated the image quality through a visual Turing test. Further evaluations were performed to investigate whether GAN augmentation enhanced convolutional neural network (CNN) classifier performance.
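The abstract does not include the training pipeline itself; as a rough illustration of the augmentation workflow it describes, the Python sketch below mixes images from a stand-in conditional generator into a CNN classifier's training set. The generator class, label count, image size, and toy data are hypothetical placeholders, not the authors' StarGAN v2 implementation.

```python
# Sketch only (not the authors' code): mixing GAN-synthesized chest radiographs
# into a CNN classifier's training data to counter class imbalance.
# FakeGenerator stands in for a trained StarGAN v2 generator; in the study,
# StarGAN v2 translated real radiographs into images carrying specific pathologies.

import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader
from torchvision.models import densenet121

NUM_CLASSES = 4          # hypothetical number of pathology labels
IMG_SIZE = 64            # reduced resolution so the sketch runs quickly

class FakeGenerator(nn.Module):
    """Placeholder: maps (source image, target pathology label) -> synthetic image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + NUM_CLASSES, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x, y):
        # Broadcast the one-hot target label over spatial dims and concatenate,
        # as in many conditional image-to-image translation setups.
        y_map = nn.functional.one_hot(y, NUM_CLASSES).float()
        y_map = y_map[:, :, None, None].expand(-1, -1, IMG_SIZE, IMG_SIZE)
        return self.net(torch.cat([x, y_map], dim=1))

# Toy "real" data (placeholder for images from MIMIC, CheXpert, etc.)
real_x = torch.randn(32, 1, IMG_SIZE, IMG_SIZE)
real_y = torch.randint(0, NUM_CLASSES, (32,))
real_ds = TensorDataset(real_x, real_y)

# GAN augmentation: synthesize extra examples for an under-represented label.
gen = FakeGenerator().eval()
rare_label = torch.full((32,), 3, dtype=torch.long)   # pretend label 3 is rare
with torch.no_grad():
    synth_x = gen(real_x, rare_label)
synth_ds = TensorDataset(synth_x, rare_label)

# Train a CNN classifier on the combined (real + synthetic) data.
train_loader = DataLoader(ConcatDataset([real_ds, synth_ds]), batch_size=8, shuffle=True)
clf = densenet121(num_classes=NUM_CLASSES)
clf.features.conv0 = nn.Conv2d(1, 64, 7, stride=2, padding=3, bias=False)  # 1-channel input
opt = torch.optim.Adam(clf.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for x, y in train_loader:    # one illustrative pass over the augmented set
    opt.zero_grad()
    loss = loss_fn(clf(x), y)
    loss.backward()
    opt.step()
```

In the study itself, the synthetic images came from StarGAN v2 trained on the six datasets listed above; the sketch only shows where such images would enter a classifier's training loop.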

Results: In identifying GAN images as artificial, there was no significant difference in sensitivity between radiologists and random guessing (radiologists: 147/275 (53.5%) vs. random guessing: 137.5/275 (50%); p = .284). GAN augmentation enhanced CNN classifier performance by 11.7%.
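The abstract does not state which statistical test produced p = .284; the figures are consistent with a two-sided comparison of the radiologists' 147/275 correct identifications against the 50% chance level. A minimal check, assuming an exact binomial test:

```python
# Minimal check, assuming a two-sided exact binomial test against chance (p0 = 0.5);
# the abstract does not specify the test, so the p-value may differ slightly.
from scipy.stats import binomtest

k, n = 147, 275                               # GAN images correctly flagged as artificial
result = binomtest(k, n, p=0.5, alternative="two-sided")
print(f"observed sensitivity: {k / n:.1%}")   # 53.5%
print(f"p-value: {result.pvalue:.3f}")        # compare with the reported p = .284
```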

Conclusion: Radiologists effectively classified chest pathologies from the synthesized radiographs, suggesting that the images contained adequate clinical information. Furthermore, GAN augmentation enhanced CNN performance, offering a way to overcome data imbalance in medical AI training. CNN-based methods rely on the amount and quality of training data; the present study showed that GAN augmentation could effectively augment training data for medical AI.

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Certification
  • Hospitals, University
  • Humans
  • Neural Networks, Computer*
  • Radiography

Grants and funding

Y.M. received a grant from the MD-PhD/Medical Scientist Training Program through the Korea Health Industry Development Institute, funded by the Korean government (Ministry of Health and Welfare, http://www.mohw.go.kr/eng/index.jsp). M.C. received a grant from the National Research Foundation of Korea (NRF) funded by the Korean government (Ministry of ICT, Science, and Technology, https://www.msit.go.kr/eng/index.do) (No. 2021R1C1C2095529). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.