ChatGPT-Exploring Its Role in Clinical Chemistry

Ann Clin Lab Sci. 2023 Nov;53(6):835-839.

Abstract

Objective: To evaluate the utility of artificial intelligence-powered language models (ChatGPT 3.5 and GPT-4) compared to trainees and clinical chemists in responding to common laboratory questions in the broad area of Clinical Chemistry.

Methods: Thirty-five questions drawn from real-life case scenarios, clinical consultations, and clinical chemistry testing were used to evaluate ChatGPT 3.5 and GPT-4 alongside clinical chemistry trainees (residents/fellows) and clinical chemistry faculty. Responses were scored by category and by years of experience.

Results: The senior chemistry faculty demonstrated superior accuracy, with 100% correct responses compared to 90.5%, 82.9%, and 71.4% correct responses from the junior chemistry faculty, fellows, and residents, respectively. All human groups matched or outperformed ChatGPT 3.5 and GPT-4, which generated 60% and 71.4% correct responses, respectively. Of the subcategories examined, ChatGPT 3.5 achieved 100% accuracy in endocrinology, while GPT-4 did not achieve 100% accuracy in any subcategory. GPT-4 was better overall than ChatGPT 3.5, matching the residents' rate of correct responses (71.4%), but performed worse than human participants when both the partially correct and incorrect indices were considered.

Conclusion: Despite the advances in AI-powered language models, ChatGPT 3.5 and GPT-4 cannot replace a trained pathologist in answering clinical chemistry questions. People, especially those not trained in clinical chemistry, should exercise caution when using chatbots to interpret test results.

MeSH terms

  • Artificial Intelligence*
  • Chemistry, Clinical*
  • Humans
  • Laboratories
  • Pathologists