Assessing AI-Powered Patient Education: A Case Study in Radiology

Acad Radiol. 2024 Jan;31(1):338-342. doi: 10.1016/j.acra.2023.08.020. Epub 2023 Sep 14.

Abstract

Rationale and objectives: With recent advancements in the power and accessibility of artificial intelligence (AI) Large Language Models (LLMs), patients might increasingly turn to these platforms to answer questions regarding radiologic examinations and procedures, despite valid concerns about the accuracy of information provided. This study aimed to assess the accuracy and completeness of information provided by the Bing Chatbot, an LLM powered by ChatGPT, on patient education for common radiologic exams.

Materials and methods: We selected three common radiologic examinations and procedures: computed tomography (CT) abdomen, magnetic resonance imaging (MRI) spine, and bone biopsy. For each, ten questions were tested on the chatbot in two trials using three different chatbot settings. Two reviewers independently assessed the chatbot's responses for accuracy and completeness compared to an accepted online resource, radiologyinfo.org.

Results: Of the 360 reviews performed, 336 (93%) were rated "entirely correct" and 24 (7%) were rated "mostly correct," indicating a high level of reliability. Completeness ratings showed that 65% were "complete" and 35% were "mostly complete." The "More Creative" chatbot setting produced a higher proportion of responses rated "entirely correct," but there were otherwise no significant differences in ratings based on chatbot settings or exam types. The responses were rated at an eighth-grade readability level.
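
The review count and accuracy percentages reported above can be checked with a short calculation. The sketch below is illustrative only: it assumes that each of the two reviewers rated every chatbot response exactly once, which is consistent with the reported total of 360 but is not stated explicitly in the abstract. All variable names are hypothetical.

```python
# Study design figures taken from the abstract.
exams = ["CT abdomen", "MRI spine", "bone biopsy"]
questions_per_exam = 10
trials = 2
chatbot_settings = 3   # the abstract names only "More Creative"
reviewers = 2

# Assumption: every response was rated once by each reviewer.
responses = len(exams) * questions_per_exam * trials * chatbot_settings
reviews = responses * reviewers

# Accuracy counts reported in the Results.
entirely_correct = 336
mostly_correct = 24


def percent(count: int, total: int) -> int:
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


print(f"responses generated: {responses}")
print(f"reviews performed:   {reviews}")
print(f"entirely correct:    {percent(entirely_correct, reviews)}%")
print(f"mostly correct:      {percent(mostly_correct, reviews)}%")
```

Under that assumption, the design yields 180 responses and 360 reviews, and the two accuracy counts sum to the full review total, matching the rounded 93% and 7% figures.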

Conclusion: The Bing Chatbot provided accurate responses that addressed all or most aspects of each question asked, with responses tending to err on the side of caution for nuanced questions. Importantly, no responses were inaccurate or had the potential to cause harm or confusion for the user. Thus, LLM chatbots demonstrate potential to enhance patient education in radiology and could be integrated into patient portals for various purposes, including exam preparation and results interpretation.

Keywords: Artificial intelligence; Bing Chatbot; Large language models; Patient education.

MeSH terms

  • Artificial Intelligence*
  • Humans
  • Patient Education as Topic
  • Radiography
  • Radiology*
  • Reproducibility of Results