Ethical and Professional Decision-Making Capabilities of Artificial Intelligence Chatbots: Evaluating ChatGPT's Professional Competencies in Medicine

Med Sci Educ. 2024 Feb 17;34(2):331-333. doi: 10.1007/s40670-024-02005-z. eCollection 2024 Apr.

Abstract

Purpose: We examined the performance of artificial intelligence chatbots on the PREview Practice Exam, an online situational judgment test for professionalism and ethics.

Methods: We used validated methodologies to calculate scores, and we compared scores by model and competency using descriptive statistics, χ² tests, and Fisher's exact tests.
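The abstract does not include the analysis code; below is a minimal Python sketch, assuming SciPy and a hypothetical 2x2 contingency table of correct/incorrect counts per model, of how such χ² and Fisher's exact comparisons might be run. All counts are illustrative, not the study's data.

    from scipy.stats import chi2_contingency, fisher_exact

    # Hypothetical contingency table: rows = models, columns = [correct, incorrect].
    # These counts are illustrative only, not the study's data.
    table = [[82, 4],   # GPT-3.5
             [85, 1]]   # GPT-4

    chi2, p_chi2, dof, expected = chi2_contingency(table)  # chi-square test of independence
    odds_ratio, p_fisher = fisher_exact(table)             # exact test, preferred for small cell counts

    print(f"chi-square p = {p_chi2:.3f}; Fisher's exact p = {p_fisher:.3f}")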

Results: GPT-3.5 and GPT-4 scored 6/9 (76th percentile) and 7/9 (92nd percentile), respectively, both higher than the medical school applicant average of 5/9 (56th percentile). Both models answered over 95% of questions correctly.

Conclusions: Chatbots outperformed the average medical school applicant on the PREview exam, suggesting both their potential to support healthcare training and decision-making and the risks they pose to online assessment delivery.

Keywords: AAMC; ChatGPT; OpenAI; PREview.