Non-Fatal Drowning Risk Prediction Based on Stacking Ensemble Algorithm

Xinshan Xie; Zhixing Li; Haofeng Xu; Dandan Peng; Lihua Yin; Ruilin Meng; Wei Wu; Wenjun Ma; Qingsong Chen

doi:10.3390/children9091383

Children (Basel). 2022 Sep 14;9(9):1383. doi: 10.3390/children9091383.

Authors

Xinshan Xie¹,², Zhixing Li²,³, Haofeng Xu⁴, Dandan Peng⁴, Lihua Yin², Ruilin Meng⁴, Wei Wu¹,², Wenjun Ma³, Qingsong Chen¹

Affiliations

¹ School of Public Health, Guangdong Pharmaceutical University, Guangzhou 510200, China.
² Guangdong Provincial Institute of Public Health, Guangdong Provincial Center for Disease Control and Prevention, Guangzhou 511430, China.
³ Department of Public Health, School of Medicine, Jinan University, Guangzhou 510630, China.
⁴ Guangdong Provincial Center for Disease Control and Prevention, Guangzhou 511430, China.

Abstract

Drowning is a major public health problem and a leading cause of death among children in developing countries. We sought machine learning (ML) algorithms that could offer a new approach to risk assessment for non-fatal drowning. Data on non-fatal drowning were collected in Qingyuan City, Guangdong Province, China. We developed four ML models to predict non-fatal drowning risk: a logistic regression (LR) model, a random forest (RF) model, a support vector machine (SVM) model, and a stacking-based model built on the three primary learners (LR, RF, and SVM). The area under the curve (AUC), F1 value, accuracy, sensitivity, and specificity were calculated to evaluate the predictive ability of each algorithm. The study included 8390 children, of whom 1013 (12.07%) had experienced non-fatal drowning. The following factors were closely associated with the risk of non-fatal drowning: frequency of swimming in open water, distance between the school and surrounding open water, swimming skills, personality (introvert), and relationship with family members. Compared with the three base models, the stacking generalization model achieved superior performance on the non-fatal drowning dataset (AUC = 0.741, sensitivity = 0.625, F1 value = 0.359, accuracy = 0.739, and specificity = 0.754). These results suggest that stacking ensemble algorithms may outperform other ML models on non-fatal drowning data.
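
The threshold-based metrics named in the abstract (sensitivity, specificity, accuracy, and F1 value) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the names ConfusionCounts and classification_metrics are assumptions introduced here, not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # positives predicted positive
    fn: int  # positives predicted negative
    fp: int  # negatives predicted positive
    tn: int  # negatives predicted negative


def classification_metrics(c: ConfusionCounts) -> dict:
    """Standard metric definitions; assumes every denominator is non-zero."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),   # recall on the positive class
        "specificity": c.tn / (c.tn + c.fp),   # recall on the negative class
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Illustrative counts only (not from the paper): 80 positives, 520 negatives.
toy = ConfusionCounts(tp=50, fn=30, fp=120, tn=400)
metrics = classification_metrics(toy)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With an imbalanced outcome such as this toy example, false positives from the large negative class pull precision down, so the F1 value can sit well below accuracy even when sensitivity and specificity look reasonable.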

Keywords: drowning; machine learning; prediction; risk-factors; stacking ensemble.

Grants and funding

This work was primarily supported by the National Key Research and Development Program of China (2018YFA0606200) and the Guangdong Provincial Science and Technology Program (2018B020207006).