Machine Learning-Based Prediction Models for Depression Symptoms Among Chinese Healthcare Workers During the Early COVID-19 Outbreak in 2020: A Cross-Sectional Study

Zhaohe Zhou; Dan Luo; Bing Xiang Yang; Zhongchun Liu

doi:10.3389/fpsyt.2022.876995

Front Psychiatry. 2022 Apr 29:13:876995. doi: 10.3389/fpsyt.2022.876995. eCollection 2022.

Authors

Zhaohe Zhou¹, Dan Luo²,³, Bing Xiang Yang²,³,⁴, Zhongchun Liu⁴

Affiliations

¹ School of Basic Medical Sciences, Chengdu University, Chengdu, China.
² School of Nursing, Wuhan University, Wuhan, China.
³ Population and Health Research Center, Wuhan University, Wuhan, China.
⁴ Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan, China.

Abstract

Background: Depression symptoms among healthcare workers related to the 2019 novel coronavirus disease (COVID-19) have received worldwide recognition. Although many studies have identified risk exposures associated with depression symptoms among healthcare workers, few have focused on predictive models using machine learning methods. Society, governments, and organizations are concerned about the need for immediate interventions and alert systems for healthcare workers who are mentally at risk. This study aims to develop and validate machine learning-based models for predicting depression symptoms using survey data collected during the COVID-19 outbreak in China.

Method: A total of 2,574 healthcare workers in hospitals designated to care for COVID-19 patients were surveyed between 20 January and 11 February 2020. The Patient Health Questionnaire-9 (PHQ-9) was used to measure depression symptoms and quantify their severity; a PHQ-9 score of ≥5 was classified as positive for depression symptoms. Four machine learning approaches were trained on 75% of the data and tested on the remaining 25%: decision tree, logistic regression with least absolute shrinkage and selection operator (LASSO), random forest, and gradient-boosting tree. Cross-validation with 100 repetitions was applied to the training dataset for hyperparameter tuning. Finally, all models were compared to evaluate their predictive performance and screening utility.
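The labelling rule and the 75%/25% split described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the fixed random seed, and the simple (unstratified) shuffle are assumptions made for illustration, and the abstract does not state which software or splitting procedure was used. The only values taken from the text are the PHQ-9 cutoff of ≥5 and the 75% training fraction; the 0 to 3 range of each of the nine items is the standard PHQ-9 scoring.

```python
import random

PHQ9_CUTOFF = 5  # a total score >= 5 counts as positive (from the abstract)


def phq9_total(items):
    """Sum the nine PHQ-9 item scores; each item is scored 0 to 3."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("PHQ-9 needs nine item scores, each in 0..3")
    return sum(items)


def label_depression(items, cutoff=PHQ9_CUTOFF):
    """Return 1 if the PHQ-9 total reaches the cutoff, else 0."""
    return 1 if phq9_total(items) >= cutoff else 0


def split_train_test(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into train and test parts.

    The seed and the plain random shuffle are illustrative assumptions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    respondent = [1, 1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical answers, total 5
    print(label_depression(respondent))
    train, test = split_train_test(range(100))
    print(len(train), len(test))
```

In this sketch the binary label would serve as the prediction target, and hyperparameter tuning by repeated cross-validation would be run on the training part only, so that the held-out 25% is used once for the final comparison.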

Results: The important risk predictors identified and ranked by the machine learning models were highly consistent: self-perceived health status factors always occupied the top five most important predictors, followed by worry about infection, working on the frontline, a very high level of uncertainty, having received any form of psychological support material, and having COVID-19-like symptoms. The areas under the curve [95% CI] of the machine learning models were as follows: LASSO model, 0.824 [0.792-0.856]; random forest, 0.828 [0.797-0.859]; gradient-boosting tree, 0.829 [0.798-0.861]; and decision tree, 0.785 [0.752-0.819]. The calibration plot indicated that the LASSO model, random forest, and gradient-boosting tree fit the data well. Decision curve analysis showed that all models obtained net benefits for predicting depression symptoms.
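For readers unfamiliar with the two evaluation measures reported above, the following sketch shows how an area under the curve and a decision-curve net benefit can be computed from true labels and predicted probabilities. It is an illustration under stated assumptions, not the authors' analysis: the function names and the toy data are hypothetical, the AUC uses the rank-based (Mann-Whitney) definition, and the net benefit uses the standard decision curve formula, net benefit = TP/n − (FP/n) · p_t/(1 − p_t), where p_t is the threshold probability. Confidence intervals are not computed here.

```python
def auc(y_true, y_score):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case (ties
    count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit of flagging everyone whose predicted risk is at or above
    the threshold probability (0 < threshold < 1)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Toy data for illustration only; not from the study.
    labels = [0, 0, 1, 1]
    risks = [0.1, 0.4, 0.35, 0.8]
    print(auc(labels, risks))
    print(net_benefit(labels, risks, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "flag everyone" and "flag no one" strategies; a model is useful at a given threshold when its net benefit exceeds both.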

Conclusions: This study shows that machine learning prediction models are suitable for making predictions about mentally at-risk healthcare workers in a public health emergency setting. The application of multidimensional machine learning models could support hospitals' and healthcare workers' decision-making on possible psychological interventions and proper mental health management.

Keywords: COVID-19; depression; health personnel; machine learning; predictive value of tests.