Integration and Validation of a Natural Language Processing Machine Learning Suicide Risk Prediction Model Based on Open-Ended Interview Language in the Emergency Department

Joshua Cohen; Jennifer Wright-Berryman; Lesley Rohlfs; Douglas Trocinski; LaMonica Daniel; Thomas W Klatt

doi:10.3389/fdgth.2022.818705

Front Digit Health. 2022 Feb 2:4:818705. doi: 10.3389/fdgth.2022.818705. eCollection 2022.

Authors

Joshua Cohen¹, Jennifer Wright-Berryman², Lesley Rohlfs¹, Douglas Trocinski³, LaMonica Daniel⁴, Thomas W Klatt⁵

Affiliations

¹ Clarigent Health, Mason, OH, United States.
² Department of Social Work, College of Allied Health Sciences, University of Cincinnati, Cincinnati, OH, United States.
³ WPP Emergency Services, Raleigh, NC, United States.
⁴ WPP Clinical Research, Raleigh, NC, United States.
⁵ Behavioral Health Network, Raleigh, NC, United States.

Abstract

Background: Emergency departments (EDs) are an important intercept point for identifying suicide risk and connecting patients to care; however, more innovative, person-centered screening tools are needed. Natural language processing (NLP)-based machine learning (ML) techniques have shown promise for assessing suicide risk, but it is unknown whether NLP models perform well in different geographic regions, at different time periods, or after large-scale events such as the COVID-19 pandemic.

Objective: To evaluate the performance of an NLP/ML suicide risk prediction model on newly collected language from the Southeastern United States using models previously tested on language collected in the Midwestern US.

Method: Thirty-seven suicidal and 33 non-suicidal patients from two EDs were interviewed to test a previously developed NLP/ML suicide risk prediction model. Model performance was evaluated with the area under the receiver operating characteristic curve (AUC) and the Brier score.
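
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how AUC and the Brier score are conventionally computed from binary labels and predicted probabilities. The function names and the toy data are illustrative assumptions; they are not the study's code or data, and the abstract does not describe the authors' implementation.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome
    (0 is perfect; lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Toy data for illustration only (NOT from the study):
# 1 = suicidal, 0 = non-suicidal; probs = hypothetical model outputs.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]

auc = roc_auc(labels, probs)        # 8 of 9 pairs ranked correctly
brier = brier_score(labels, probs)  # 0.83 / 6
print(f"AUC = {auc:.3f}, Brier = {brier:.3f}")
```

AUC measures discrimination (how well the model ranks suicidal above non-suicidal cases), while the Brier score also reflects calibration of the predicted probabilities, which is why the two are reported together.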

Results: The NLP/ML models performed with an AUC of 0.81 (95% CI: 0.71-0.91) and a Brier score of 0.23.
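
As context for reading the Brier score, the sketch below computes two uninformative reference points for a sample with the class counts reported above (37 suicidal, 33 non-suicidal): a constant prediction of 0.5 and a constant prediction equal to the sample prevalence. These baselines are an illustrative assumption added here for interpretation; they are not reported in the abstract, and the variable names are hypothetical.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Class counts taken from the Method section; everything else is illustrative.
n_suicidal, n_non_suicidal = 37, 33
n = n_suicidal + n_non_suicidal
labels = [1] * n_suicidal + [0] * n_non_suicidal

# Baseline 1: always predict 0.5, which gives a Brier score of exactly 0.25.
coin_flip = brier_score(labels, [0.5] * n)

# Baseline 2: always predict the sample prevalence p, which gives p * (1 - p).
prevalence = n_suicidal / n
base_rate = brier_score(labels, [prevalence] * n)

print(f"constant 0.5: {coin_flip:.3f}")
print(f"constant prevalence ({prevalence:.3f}): {base_rate:.3f}")
```

Both baselines come out near 0.25, so a Brier score is usually read relative to these reference values rather than on its own.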

Conclusion: The language-based suicide risk model performed with good discrimination when identifying the language of suicidal patients from a different part of the US and at a later time period than when the model was originally developed and trained.

Keywords: emergency department (ED); feasibility & acceptability; machine learning; mental health; natural language processing; risk assessment; suicide; validation.