Diagnostic capabilities of ChatGPT in ophthalmology

Graefes Arch Clin Exp Ophthalmol. 2024 Jan 6. doi: 10.1007/s00417-023-06363-z. Online ahead of print.

Abstract

Purpose: To assess the diagnostic accuracy of ChatGPT in the field of ophthalmology.

Methods: This retrospective cohort study was conducted at a single academic tertiary medical center. We reviewed the records of patients admitted to the ophthalmology department from 06/2022 to 01/2023 and created two clinical cases for each patient: the first based on the medical history alone (Hx), and the second adding the clinical examination findings (Hx and Ex). For each case, we asked ChatGPT, residents, and attendings for the three most likely diagnoses and compared the accuracy rates (at least one correct diagnosis) across the groups. We also compared the total time each group took to complete the assignment.

Results: ChatGPT, residents, and attendings evaluated 126 cases from 63 patients (history only, and history plus examination findings, for each patient). On the Hx cases, ChatGPT achieved a significantly lower rate of accurate diagnoses (54%) than the residents (75%; p < 0.01) and attendings (71%; p < 0.01). With the clinical examination findings added, ChatGPT's accuracy rose to 68%, compared with 94% for the residents (p < 0.01) and 86% for the attendings (p < 0.01). ChatGPT completed the task 4 to 5 times faster than the attendings and residents.

Conclusions and relevance: ChatGPT showed lower diagnostic accuracy than residents and attendings in ophthalmology cases, whether based on patient history alone or combined with clinical examination findings. However, ChatGPT completed the task substantially faster than the physicians.

Keywords: Artificial intelligence; ChatGPT; Diagnosis; Ophthalmology; Residents.