Predicting state-level suicide fatalities in the United States with real-time data and machine learning

Npj Ment Health Res. 2024 Jan 16;3(1):3. doi: 10.1038/s44184-023-00045-8.

Abstract

Digital trace data and machine learning techniques are increasingly being adopted to predict suicide-related outcomes at the individual level; however, there is also considerable public health need for timely data about suicide trends at the population level. Although significant geographic variation in suicide rates exists by state within the United States, national systems for reporting state suicide trends typically lag by one or more years. We developed and validated a deep learning-based approach that uses real-time, state-level online data (Mental Health America web-based depression screenings; Google and YouTube Search Trends), social media data (Twitter), and health administrative data (National Syndromic Surveillance Program emergency department visits) to estimate weekly suicide counts in four participating states. Specifically, for each state, we built a long short-term memory (LSTM) neural network model to combine signals from the real-time data sources and compared the model's predicted values of suicide deaths to observed values in the same state. Our LSTM model produced accurate estimates of state-specific suicide rates in all four states (percentage error in suicide rate of -2.768% for Utah, -2.823% for Louisiana, -3.449% for New York, and -5.323% for Colorado). Furthermore, our deep learning-based approach outperformed current gold-standard baseline autoregressive models that use historical death data alone. We demonstrate an approach to incorporating signals from multiple proxy real-time data sources that can potentially provide more timely estimates of suicide trends at the state level. Timely suicide data at the state level has the potential to improve suicide prevention planning and response tailored to the needs of specific geographic communities.
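
The percentage errors reported above can be read with a small worked example. The sketch below is illustrative only: the abstract does not define the formula, so it assumes the common signed convention, 100 × (predicted − observed) / observed, under which a negative value means the model estimated fewer deaths than were observed. The function names, weekly counts, and population figure are hypothetical and are not taken from the study.

```python
def percentage_error(predicted, observed):
    """Signed percentage error (assumed convention).

    Negative values mean the prediction falls below the observed value.
    """
    if observed == 0:
        raise ValueError("observed value must be nonzero")
    return 100.0 * (predicted - observed) / observed


def rate_per_100k(weekly_counts, population):
    """Aggregate weekly death counts into a crude rate per 100,000 people."""
    return 100_000 * sum(weekly_counts) / population


# Hypothetical numbers for illustration only; not data from the study.
observed_weekly = [12, 15, 11, 14]
predicted_weekly = [11.5, 14.0, 11.5, 13.5]
population = 3_000_000

observed_rate = rate_per_100k(observed_weekly, population)
predicted_rate = rate_per_100k(predicted_weekly, population)
error = percentage_error(predicted_rate, observed_rate)

print(f"observed rate:  {observed_rate:.3f} per 100,000")
print(f"predicted rate: {predicted_rate:.3f} per 100,000")
print(f"percentage error: {error:.3f}%")
```

Because the observed and predicted rates share the same population denominator, the percentage error in the rate equals the percentage error in the summed counts under this convention.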