Machine Learning Model Based on Lipidomic Profile Information to Predict Sudden Infant Death Syndrome

Healthcare (Basel). 2022 Jul 14;10(7):1303. doi: 10.3390/healthcare10071303.

Abstract

Sudden infant death syndrome (SIDS) represents the leading cause of death in infants under one year of age in developing countries. Its etiology remains unclear, and no biomarker is discriminative enough to predict the risk of suffering from it. In this work, we took a public dataset on the lipidomic profile of babies who died from this syndrome and of a control group, and performed a univariate analysis using the Mann-Whitney U test to identify the features that discriminate between the two groups. Features with a p-value less than or equal to 0.05 were retained and used to build four classification models: random forest (RF), logistic regression (LR), support vector machine (SVM), and naive Bayes (NB). Seventy percent of the data was used for model training with cross-validation (k = 5), and the remaining 30% was held out for a blind test, which simulates the real-life scenario of a population unknown to the model. RF performed best: in the blind test, it obtained an AUC of 0.9, a specificity of 1, and a sensitivity of 0.8. The proposed model provides a basis for building a computational SIDS risk-prediction tool that could contribute to prevention, and it suggests lines of research for addressing this pathology.
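
The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes SciPy and scikit-learn, runs on synthetic data in place of the lipidomic dataset, and shows only the RF model. The function names (`select_features`, `run_pipeline`), the number of trees, and the random seed are hypothetical choices made for the example.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split


def select_features(X, y, alpha=0.05):
    """Return indices of columns whose Mann-Whitney U p-value is <= alpha."""
    keep = []
    for j in range(X.shape[1]):
        _, p = mannwhitneyu(X[y == 1, j], X[y == 0, j], alternative="two-sided")
        if p <= alpha:
            keep.append(j)
    return np.array(keep, dtype=int)


def run_pipeline(X, y, seed=0):
    """Univariate filter, 70/30 split, 5-fold CV on the training part, blind test."""
    # Follows the order given in the abstract: filter first, then split.
    selected = select_features(X, y)
    X_train, X_test, y_train, y_test = train_test_split(
        X[:, selected], y, test_size=0.30, stratify=y, random_state=seed
    )
    # n_estimators is an assumption; the abstract does not state hyperparameters.
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")

    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]
    tn, fp, fn, tp = confusion_matrix(
        y_test, model.predict(X_test), labels=[0, 1]
    ).ravel()
    return {
        "selected": selected,
        "cv_auc": cv_auc,
        "blind_auc": roc_auc_score(y_test, prob),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Synthetic stand-in for the lipidomic table: 100 samples, 20 features,
# of which the first three are shifted in the "case" group.
rng = np.random.default_rng(0)
n_samples = 100
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, 20))
X[y == 1, :3] += 1.5

results = run_pipeline(X, y)
print(results["selected"], results["blind_auc"])
```

A stricter variant would run the univariate filter on the training portion only, so that no information from the blind test samples influences feature selection.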

Keywords: SIDS; biomarker; glycerophospholipids; lipidomic; machine learning; metabolomic.

Grants and funding

This research received no external funding.