Evaluation of GPT-4's Chest X-Ray Impression Generation: A Reader Study on Performance and Perception

J Med Internet Res. 2023 Dec 22;25:e50865. doi: 10.2196/50865.

Abstract

Exploring the generative capabilities of the multimodal GPT-4, our study uncovered significant differences between radiological assessments and automatic evaluation metrics for chest x-ray impression generation and revealed radiological bias.

Keywords: AI; GPT; artificial intelligence; chest; diagnostic; generative; generative model; image; images; imaging; impression; impressions; medical imaging; multimodal; radiography; radiological; radiology; x-ray; x-rays.

MeSH terms

  • Benchmarking
  • Humans
  • Perception
  • Radiography
  • Radiology*
  • X-Rays