Comparative analysis of ChatGPT, Gemini and emergency medicine specialist in ESI triage assessment

Gürbüz Meral; Serdal Ateş; Serkan Günay; Ahmet Öztürk; Mikail Kuşdoğan

doi:10.1016/j.ajem.2024.05.001

Am J Emerg Med. 2024 May 3:81:146-150. doi: 10.1016/j.ajem.2024.05.001. Online ahead of print.

Authors

Gürbüz Meral¹, Serdal Ateş², Serkan Günay², Ahmet Öztürk², Mikail Kuşdoğan²

Affiliations

¹ Department of Emergency Medicine, Specialist in Emergency Medicine, Hitit University Çorum Erol Olçok Education and Research Hospital, Çorum, Turkey. Electronic address: gurbuzmeral61@gmail.com.
² Department of Emergency Medicine, Specialist in Emergency Medicine, Hitit University Çorum Erol Olçok Education and Research Hospital, Çorum, Turkey.

PMID: 38728938
DOI: 10.1016/j.ajem.2024.05.001

Abstract

Introduction: The term artificial intelligence (AI) was coined in the 1950s, and the field has made significant progress since then. During this period, numerous AI applications have been developed; GPT-4 and Gemini are two of the best known. The Emergency Severity Index (ESI) is currently one of the most commonly used systems for patient triage in the emergency department. This study aims to compare the performance of GPT-4, Gemini, and emergency medicine specialists in ESI triage, and to contribute to the literature on the usability of these AI programs in emergency department triage.

Methods: Our study was conducted between February 1, 2024, and February 29, 2024, with emergency medicine specialists in Turkey, as well as with GPT-4 and Gemini. Ten emergency medicine specialists were included. As a limitation, the participating specialists do not frequently use the ESI triage model in daily practice. In the first phase of the study, 100 case examples related to adult or trauma patients were extracted from the sample and training cases in the ESI Implementation Handbook. In the second phase, the responses were categorized into three groups: correct triage, over-triage, and under-triage. In the third phase, the questions were categorized according to the correct triage responses.
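The three-way categorization in the second phase can be illustrated with a short sketch. It assumes the standard ESI convention that level 1 is the most urgent and level 5 the least, so an assigned level numerically lower than the reference level counts as over-triage and a higher one as under-triage. The function names and the sample values are hypothetical and are not taken from the study.

```python
from collections import Counter

CATEGORIES = ("correct", "over-triage", "under-triage")


def classify_triage(assigned: int, reference: int) -> str:
    """Compare an assigned ESI level with the reference level.

    Assumes ESI levels run from 1 (most urgent) to 5 (least urgent).
    """
    for level in (assigned, reference):
        if level not in range(1, 6):
            raise ValueError(f"ESI level must be 1-5, got {level}")
    if assigned == reference:
        return "correct"
    # A lower number means higher urgency than the reference: over-triage.
    return "over-triage" if assigned < reference else "under-triage"


def summarize(assigned_levels, reference_levels):
    """Return the percentage of cases falling in each category."""
    if len(assigned_levels) != len(reference_levels):
        raise ValueError("Both lists must have the same length")
    if not reference_levels:
        raise ValueError("At least one case is required")
    counts = Counter(
        classify_triage(a, r) for a, r in zip(assigned_levels, reference_levels)
    )
    total = len(reference_levels)
    return {name: 100.0 * counts[name] / total for name in CATEGORIES}


# Hypothetical ratings for four cases (illustrative only).
rates = summarize([1, 2, 3, 4], [1, 3, 3, 3])
print(rates)
```

Applied to one rater's answers for all cases, a function like `summarize` would yield that rater's correct, over-triage, and under-triage percentages, which could then be averaged within each group.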

Results: A statistically significant difference was found among the three groups in terms of correct triage, over-triage, and under-triage (p < 0.001). GPT-4 had the highest correct triage rate, with a mean of 70.60 (±3.74), while Gemini had the highest over-triage rate, with a mean of 35.2 (±2.93) (p < 0.001). The highest under-triage rate was observed among emergency medicine specialists, with a mean of 32.90 (±11.83). In the ESI 1-2 class, Gemini had a correct triage rate of 87.77%, GPT-4 had 85.11%, and emergency medicine specialists had 49.33%.

Conclusion: Our study shows that both GPT-4 and Gemini can accurately triage critical and urgent patients in the ESI 1 and 2 groups at a high rate. Furthermore, GPT-4 was more successful in ESI triage across all patients. These results suggest that GPT-4 and Gemini could assist in accurate ESI triage of patients in emergency departments.

Keywords: Emergency severity index; GPT-4; Gemini; Triage.