Utilizing Natural Language Processing of Narrative Feedback to Develop a Predictive Model of Pre-Clerkship Performance: Lessons Learned

Christina Maimone; Brigid M Dolan; Marianne M Green; Sandra M Sanguino; Patricia M Garcia; Celia Laird O'Brien

doi:10.5334/pme.40

Perspect Med Educ. 2023 May 3;12(1):141-148. doi: 10.5334/pme.40. eCollection 2023.

Authors

Christina Maimone¹, Brigid M Dolan², Marianne M Green³, Sandra M Sanguino⁴, Patricia M Garcia⁵, Celia Laird O'Brien⁶

Affiliations

¹ Associate director of research data services, Northwestern IT Research Computing Services, Northwestern University, Evanston, Illinois, USA.
² Associate professor of medicine and medical education and director of assessment, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.
³ Raymond H. Curry, MD Professor of Medical Education, professor of medicine, and vice dean for education, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.
⁴ Associate professor of pediatrics and senior associate dean of medical education, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.
⁵ Professor of obstetrics and gynecology and medical education, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.
⁶ Assistant professor of medical education and assistant dean of program evaluation and accreditation, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.

PMID: 37151853
PMCID: PMC10162355
DOI: 10.5334/pme.40

Abstract

Background: Natural language processing is a promising technique for making the review of narrative feedback to learners more efficient. The Feinberg School of Medicine has formally reviewed pre-clerkship narrative feedback through its portfolio assessment system since 2014, but this process requires considerable time and effort. This article describes how natural language processing was used to build a predictive model of pre-clerkship student performance that can assist competency committee reviews.

Approach: The authors took an iterative and inductive approach to the analysis, which allowed them to identify characteristics of narrative feedback that are both predictive of performance and useful to faculty reviewers. Words and phrases were manually grouped into topics that represented concepts illustrating student performance. Topics were reviewed by experienced reviewers, tested for consistency across time, and checked to ensure they did not demonstrate bias.
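The idea of manually grouping words and phrases into topics can be illustrated with a minimal Python sketch. This is not the authors' code: the topic names, the phrases, and the helper functions `topic_counts` and `word_count` are hypothetical, chosen only to show how hand-curated phrase groups might be turned into per-comment counts.

```python
import re
from collections import Counter

# Hypothetical topic groups. The names and phrases are illustrative
# assumptions, not the sixteen groups identified in the study.
TOPIC_GROUPS = {
    "preparedness": ["well prepared", "unprepared", "came prepared"],
    "participation": ["quiet", "actively participates", "engaged"],
}


def topic_counts(comment, topic_groups=TOPIC_GROUPS):
    """Count how often each topic group's phrases appear in one comment."""
    text = comment.lower()
    counts = Counter()
    for topic, phrases in topic_groups.items():
        for phrase in phrases:
            # Word boundaries keep "prepared" from matching inside "unprepared".
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[topic] += len(re.findall(pattern, text))
    return dict(counts)


def word_count(comment):
    """Count word-like tokens (letters and apostrophes) in a comment."""
    return len(re.findall(r"[A-Za-z']+", comment))


if __name__ == "__main__":
    example = "She came prepared and was engaged, though quiet at first."
    print(topic_counts(example))
    print(word_count(example))
```

Keeping the phrase lists in a plain dictionary is one way to make them easy for faculty reviewers to inspect and revise between iterations; how the study actually stored and matched its topic groups is not stated in this abstract.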

Outcomes: Sixteen topic groups of words and phrases were found to be predictive of performance. The best-fitting model used a combination of topic groups, word counts, and categorical ratings. The model had an AUC value of 0.92 on the training data and 0.88 on the test data.
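For readers unfamiliar with the metric, the sketch below shows one way an AUC value can be computed, assuming AUC here refers to the area under the ROC curve. It uses the pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc`, the label coding (1 for the outcome of interest, 0 otherwise), and the example scores are assumptions for illustration, not data or code from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = outcome of interest, an assumption).
    scores: iterable of model scores, higher meaning more likely to be 1.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    labels = [1, 1, 0, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.2, 0.1]
    print(round(auc(labels, scores), 3))
```

Computing the same quantity separately on training and held-out test data, as the abstract reports, gives a rough check on overfitting: a test value somewhat below the training value is expected.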

Reflection: A thoughtful, careful approach to using natural language processing was essential. Given the idiosyncrasies of narrative feedback in medical education, standard natural language processing packages were not adequate for predicting student outcomes. Rather, employing qualitative techniques including repeated member checking and iterative revision resulted in a useful and salient predictive model.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Education, Medical*
Feedback
Humans
Narration
Natural Language Processing
Students, Medical*

Grants and funding

This project was funded (in part) by a National Board of Medical Examiners (NBME) Edward J. Stemmler, MD Medical Education Research Fund grant. The project and the views expressed in this publication do not necessarily reflect the position or policy of NBME, and NBME support provides no official endorsement.