Bioinformatic Analysis and Machine Learning Methods in Neonatal Sepsis: Identification of Biomarkers and Immune Infiltration

Biomedicines. 2023 Jun 28;11(7):1853. doi: 10.3390/biomedicines11071853.

Abstract

Neonatal sepsis (NS) is a life-threatening disease whose pathogenesis remains unclear. Differentially expressed genes (DEGs) were identified from Gene Expression Omnibus (GEO) datasets, and functional enrichment analyses were conducted. Three machine learning algorithms, namely the least absolute shrinkage and selection operator (LASSO), support vector machine recursive feature elimination (SVM-RFE), and random forest (RF), were applied to identify the optimal feature genes (OFGs). CIBERSORT was used to estimate the abundance of immune infiltrates in septic and control neonates, and the relationship between the OFGs and immune cells was assessed. In total, 44 DEGs were identified between septic and control newborns. In the enrichment analysis, the DEGs were primarily related to inflammatory signaling pathways and immune responses. The OFGs derived from the three algorithms were intersected to yield four biomarkers: Hexokinase 3 (HK3), Cystatin 7 (CST7), Resistin (RETN), and Glycogenin 1 (GYG1). These candidate biomarkers were validated in other datasets and in LPS-stimulated HUVECs. Septic infants showed higher proportions of neutrophils (p < 0.001), M0 macrophages (p < 0.001), and regulatory T cells (p = 0.004) than controls. HK3, CST7, RETN, and GYG1 showed significant correlations with immune cells. Overall, these biomarkers offer insight into the molecular mechanisms of immune regulation and may inform the prediction and treatment of NS.
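
The intersection step described in the abstract can be illustrated with a minimal sketch. It assumes each algorithm has already produced its own list of selected genes. The function name `intersect_feature_sets` and the per-algorithm lists are hypothetical: the `PLACEHOLDER_*` entries are invented for illustration and are not results from the study. Only the four shared gene symbols come from the abstract.

```python
def intersect_feature_sets(selections):
    """Return the genes selected by every algorithm, sorted alphabetically.

    `selections` maps an algorithm name to the list of genes it selected.
    """
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


# Hypothetical per-algorithm selections (illustrative only, not study data).
# The PLACEHOLDER_* names stand in for genes picked by only some algorithms.
selections = {
    "LASSO": ["HK3", "CST7", "RETN", "GYG1", "PLACEHOLDER_A"],
    "SVM-RFE": ["HK3", "CST7", "RETN", "GYG1", "PLACEHOLDER_B"],
    "RF": ["HK3", "CST7", "RETN", "GYG1", "PLACEHOLDER_A", "PLACEHOLDER_C"],
}

# Genes retained by all three selectors
ofgs = intersect_feature_sets(selections)
print(ofgs)  # ['CST7', 'GYG1', 'HK3', 'RETN']
```

Requiring agreement across all three selectors is a conservative choice: a gene favored by only one or two algorithms is dropped, which trades sensitivity for robustness to the quirks of any single method.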

Keywords: GEO database; biomarkers; immune infiltration; machine learning; neonatal sepsis.

Grants and funding

This research received no external funding.