Risk Identification of Bronchopulmonary Dysplasia in Premature Infants Based on Machine Learning

Front Pediatr. 2021 Aug 17:9:719352. doi: 10.3389/fped.2021.719352. eCollection 2021.

Abstract

Bronchopulmonary dysplasia (BPD) is one of the most common complications in premature infants. The disease is caused by prolonged use of supplemental oxygen; it seriously impairs the child's lung function and imposes a heavy burden on the family and society. This study uses ensemble learning, combining the Boruta algorithm with the random forest algorithm, to identify predictors of BPD in premature infants and to build a predictive model that helps clinicians formulate an optimal treatment plan. Data were collected from the clinical records of 996 premature infants treated in the neonatology department of Liuzhou Maternal and Child Health Hospital in Western China. Premature infants with congenital anomalies, those who died, and those with incomplete data before the diagnosis of BPD were excluded, leaving 648 premature infants in the study. Feature selection was performed with the Boruta algorithm and 10-fold cross-validation. Six of the 26 candidate variables were selected, and a random forest model was built on them. The model achieved an area under the curve (AUC) of 0.929, indicating excellent predictive performance. Machine learning methods can help clinicians predict the disease and formulate the best treatment plan.
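As an aid to reading the reported AUC, the short sketch below computes the statistic from first principles: the AUC equals the probability that a randomly chosen positive case (here, an infant with BPD) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration; they are not the study's data, and this is not the authors' code.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = BPD, 0 = no BPD; scores are made-up model outputs.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy set, 11 of the 12 positive-negative pairs are ranked correctly, so the AUC is 11/12. The pairwise loop is quadratic in the number of cases and is meant for clarity; a rank-based formulation gives the same value more efficiently on larger data sets.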

Keywords: bronchopulmonary dysplasia; feature selection; machine learning; premature infants; risk identification.