Appraising the performance of ChatGPT in psychiatry using 100 clinical case vignettes

Asian J Psychiatr. 2023 Nov:89:103770. doi: 10.1016/j.ajp.2023.103770. Epub 2023 Sep 20.

Abstract

Background: ChatGPT has emerged as the most advanced and rapidly developing large language model chatbot. With potential ranging from answering simple queries to passing highly competitive medical exams, ChatGPT continues to impress scientists and researchers worldwide, prompting further discussion of its utility in various fields. One such field is Psychiatry. Because diagnosis and treatment are often suboptimal, ensuring mental health and well-being is a challenge in many countries, particularly developing nations. In this regard, we evaluated the performance of ChatGPT 3.5 in Psychiatry using clinical cases to provide evidence-based information on the implications of ChatGPT 3.5 for enhancing mental health and well-being.

Methods: In this experimental study, ChatGPT 3.5 was used to initiate conversations and collect responses to clinical vignettes in Psychiatry. The replies to 100 clinical case vignettes were assessed by expert faculty members from the Department of Psychiatry. The cases represented 100 different psychiatric illnesses. We recorded and assessed the initial ChatGPT 3.5 responses. The evaluation was based on the objective of the questions posed at the end of each case, and these objectives were divided into 10 categories. Each grade was determined by taking the mean of the scores provided by the evaluators. Graphs and tables were used to present the grades.
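
The grading step described in the Methods can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract says that each grade came from the mean of the evaluators' scores, but it does not give the scoring scale or the grade boundaries. The 0 to 10 scale, the cutoffs in `GRADE_CUTOFFS`, and the function name `grade_from_scores` are therefore hypothetical.

```python
from statistics import mean

# Hypothetical cutoffs on an assumed 0-10 scale. The abstract does not
# report the actual scale or thresholds, so these values are placeholders.
GRADE_CUTOFFS = [(8.0, "A"), (6.0, "B"), (4.0, "C")]


def grade_from_scores(scores):
    """Average the evaluators' scores for one vignette and map the mean to a letter grade."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    avg = mean(scores)
    # Cutoffs are listed from highest to lowest, so the first match wins.
    for cutoff, letter in GRADE_CUTOFFS:
        if avg >= cutoff:
            return letter
    return "D"  # below every cutoff
```

Under these assumed cutoffs, three evaluators scoring a response 9, 8, and 10 would yield a mean of 9 and therefore a "Grade A".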

Results: The evaluation suggests that ChatGPT 3.5 fared extremely well in Psychiatry, receiving "Grade A" ratings in 61 of 100 cases, "Grade B" ratings in 31, and "Grade C" ratings in 8. The majority of the queries concerned management strategies, followed by diagnosis, differential diagnosis, assessment, investigation, counselling, clinical reasoning, ethical reasoning, prognosis, and request acceptance. ChatGPT 3.5 performed especially well in generating management strategies, followed by diagnoses, for different psychiatric conditions. No responses were graded "D", indicating that there were no errors in diagnosis or in the response for clinical care. The few responses that received "Grade C" contained only a few discrepancies or missed additional details.

Conclusion: It is evident from our study that ChatGPT 3.5 has appreciable knowledge and interpretation skills in Psychiatry. Thus, ChatGPT 3.5 has the potential to transform the field of Medicine, and we emphasize its utility in Psychiatry through the findings of our study. However, for any AI model to be successful, ensuring reliability, validating information, and establishing proper guidelines and an implementation framework are necessary.

Keywords: “Artificial Intelligence”; “ChatGPT”; “Mental health”; “Mental well-being”; “Psychiatry”.

MeSH terms

  • Communication
  • Diagnosis, Differential
  • Humans
  • Mental Disorders* / diagnosis
  • Mental Disorders* / drug therapy
  • Psychiatry*
  • Reproducibility of Results