Machine Learning for Predicting the Risk for Childhood Asthma Using Prenatal, Perinatal, Postnatal and Environmental Factors

Healthcare (Basel). 2021 Oct 29;9(11):1464. doi: 10.3390/healthcare9111464.

Abstract

The prevalence of childhood asthma and its associated risk factors vary significantly across countries and regions. In Morocco, the scarcity of available medical data makes scientific research on diseases such as asthma very challenging. In this paper, we build machine learning models to predict the occurrence of childhood asthma using data from a prospective study of 202 children with and without asthma. The association between each factor and asthma diagnosis is first assessed using a Chi-squared test. Predictive models (logistic regression, decision trees, random forest and support vector machine) are then used to explore the relationship between childhood asthma and the various risk factors. In the Chi-squared feature selection step, 19 of the 36 factors were found to be significantly associated (p-value < 0.05) with childhood asthma. These include history of atopic diseases in the family; presence of mites, cold air, strong odors and mold in the child's environment; mode of birth; breastfeeding; and early life habits and exposures. For asthma prediction, random forest yielded the best predictive performance (accuracy = 84.9%), followed by logistic regression (accuracy = 82.57%), support vector machine (accuracy = 82.5%) and decision trees (accuracy = 75.19%). Although the decision tree model was the least accurate, it has the advantage of being easily interpreted. This study identified important maternal and prenatal risk factors for childhood asthma, the majority of which are avoidable. Appropriate steps are needed to raise awareness about the prenatal risk factors.
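
The Chi-squared screening step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each factor is binary and is cross-tabulated against asthma diagnosis in a 2x2 table; the function names, the factor names and the counts are hypothetical and are not taken from the study. It also assumes no continuity correction, which the abstract does not specify.

```python
import math

def chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table.

    Assumes 1 degree of freedom, no continuity correction, and
    non-zero row and column totals (otherwise an expected count is 0).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, k in enumerate(col_totals):
            expected = r * k / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

def select_factors(tables, alpha=0.05):
    """Keep the factors whose association with the outcome has p < alpha."""
    return [name for name, t in tables.items() if chi2_2x2(t)[1] < alpha]

# Hypothetical counts: rows = factor present / absent,
# columns = asthma / no asthma. Not data from the paper.
toy_tables = {
    "factor_a": [[30, 10], [10, 30]],
    "factor_b": [[20, 20], [20, 20]],
}
selected = select_factors(toy_tables)
print(selected)
```

In this toy input, only the first factor shows an association with the outcome, so it is the only one retained. The retained factors would then be passed to the classifiers being compared.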

Keywords: asthma; environment; machine learning; pediatrics; prediction; prevention; risk factors.