Performance of ChatGPT in Israeli Hebrew Internal Medicine National Residency Exam

Isr Med Assoc J. 2024 Feb;26(2):86-88.

Abstract

Background: Completing internal medicine specialty training in Israel involves passing the Israel National Internal Medicine Exam (Shlav Aleph), a challenging multiple-choice test. Chat generative pre-trained transformer (ChatGPT) 3.5, a language model, is increasingly used for exam preparation.

Objectives: To assess the ability of ChatGPT 3.5 to pass the Israel National Internal Medicine Exam in Hebrew.

Methods: ChatGPT was prompted in Hebrew with questions from the 2023 Shlav Aleph exam. Textual questions were analyzed after the appeal process, and ChatGPT's answers were compared with the official answer key.

Results: ChatGPT 3.5 correctly answered 36.6% of the 133 analyzed questions. Performance was consistent across topics, except for weaker results in nephrology and biostatistics.

Conclusions: While ChatGPT 3.5 has excelled in English-language medical exams, its performance on the Hebrew Shlav Aleph was suboptimal. Possible contributing factors include limited Hebrew training data, translation complexities, and unique language structures. Further investigation is essential before it can be adapted effectively for Hebrew medical exam preparation.

MeSH terms

  • Biometry
  • Humans
  • Internal Medicine
  • Internship and Residency*
  • Israel
  • Language