Evaluation of Online Artificial Intelligence-Generated Information on Common Hand Procedures

J Hand Surg Am. 2023 Nov;48(11):1122-1127. doi: 10.1016/j.jhsa.2023.08.003. Epub 2023 Sep 9.

Abstract

Purpose: The purpose of this study was to analyze the quality and readability of the information generated by an online artificial intelligence (AI) platform regarding 4 common hand surgeries and to compare AI-generated responses to those provided in the informational articles published by the American Society for Surgery of the Hand (ASSH) HandCare website.

Methods: An open AI model (ChatGPT) was used to answer questions commonly asked by patients on 4 common hand surgeries (carpal tunnel release, cubital tunnel release, trigger finger release, and distal radius fracture fixation). These answers were evaluated for medical accuracy, quality, and readability and compared to answers derived from the ASSH HandCare materials.

Results: For the AI model, the Journal of the American Medical Association benchmark criteria score was 0/4, and the DISCERN score was 58 (considered good). The areas in which the AI model lost points were primarily related to the lack of attribution, reliability, and currency of the source material. For AI responses, the mean Flesch-Kincaid Grade Level was 15, and the Flesch-Kincaid Reading Ease score was 34, which is considered to be college level. For comparison, ASSH HandCare materials scored 3/4 on the Journal of the American Medical Association benchmark criteria, 71 on DISCERN (excellent), 9 on Flesch-Kincaid Grade Level, and 60 on Flesch-Kincaid Reading Ease score (eighth/ninth grade level).
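
For readers unfamiliar with the readability metrics reported in the Results, the following is a minimal sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It is not the authors' scoring tool, which the abstract does not name. The function names and the vowel-group syllable counter are assumptions made for illustration; dedicated readability software typically counts syllables more carefully, so scores from this sketch may differ somewhat from published values.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) for a passage of plain English text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher is easier; roughly 60-70 reads at an eighth/ninth grade level."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade needed to understand the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    print(round(flesch_reading_ease(100, 5, 150), 1))
    print(round(flesch_kincaid_grade(100, 5, 150), 1))
```

With the illustrative counts above (100 words, 5 sentences, 150 syllables), the sketch gives a Reading Ease of about 59.6 and a Grade Level of about 9.9. Both formulas depend only on average sentence length and average syllables per word, so longer sentences and longer words lower the Reading Ease score and raise the Grade Level.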

Conclusion: An AI language model (ChatGPT) provided generally high-quality answers to frequently asked questions relating to the common hand procedures queried, but without citations to source material, it is unclear when or where these answers originated. Furthermore, a high reading level was required to comprehend the information presented. The AI software repeatedly referenced the need to discuss these questions with a surgeon, the importance of shared decision-making and individualized care, and compliance with surgeon treatment recommendations.

Clinical relevance: As novel AI applications become increasingly mainstream, hand surgeons must understand the limitations and ramifications these technologies have for patient care.

Keywords: Artificial intelligence; health literacy; internet; patient education; readability.

MeSH terms

  • Artificial Intelligence
  • Comprehension
  • Hand / surgery
  • Health Literacy*
  • Humans
  • Internet
  • Reproducibility of Results
  • United States