Identification of diagnostic gene biomarkers and immune infiltration in patients with diabetic kidney disease using machine learning strategies and bioinformatic analysis

Front Med (Lausanne). 2022 Sep 29;9:918657. doi: 10.3389/fmed.2022.918657. eCollection 2022.

Abstract

Objective: Diabetic kidney disease (DKD) is the leading cause of chronic kidney disease and end-stage renal disease worldwide. Early diagnosis is critical to prevent its progression. The aim of this study was to identify potential diagnostic biomarkers for DKD, illustrate the biological processes related to these biomarkers, and investigate their relationship with immune cell infiltration.

Materials and methods: Gene expression profiles of DKD and control samples (GSE30528, GSE96804, and GSE99339) were downloaded from the Gene Expression Omnibus database as a training set, and two further profiles (GSE47185 and GSE30122) were downloaded as a validation set. Differentially expressed genes (DEGs) were identified in the training set, and functional correlation analyses were performed. Three machine learning methods, the least absolute shrinkage and selection operator (LASSO), support vector machine-recursive feature elimination (SVM-RFE), and random forests (RF), were applied to identify potential diagnostic biomarkers. To evaluate the diagnostic efficacy of these potential biomarkers, receiver operating characteristic (ROC) curves were plotted separately for the training and validation sets, and immunohistochemical (IHC) staining for the biomarkers was performed in DKD and control kidney tissues. In addition, the CIBERSORT, xCell, and TIMER algorithms were employed to assess the infiltration of immune cells in DKD, and the relationships between the biomarkers and infiltrating immune cells were also investigated.
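
The abstract does not state how the outputs of the three feature-selection methods were combined. One common approach, assumed here purely for illustration, is to keep only the genes selected by every method. The sketch below shows that consensus step in plain Python; the function name `consensus_features` and the three gene lists are hypothetical (the `GENE_*` entries are made-up placeholders, not results from the study).

```python
from functools import reduce


def consensus_features(*selections):
    """Return the features chosen by every selection method, sorted by name."""
    if not selections:
        return []
    # Intersect the per-method selections so only shared features remain.
    common = reduce(set.intersection, (set(s) for s in selections))
    return sorted(common)


# Hypothetical per-method outputs (toy values, not taken from the paper).
lasso_genes = ["DUSP1", "PRKAR2B", "GENE_A", "GENE_B"]
svm_rfe_genes = ["DUSP1", "PRKAR2B", "GENE_B", "GENE_C"]
rf_genes = ["DUSP1", "PRKAR2B", "GENE_A", "GENE_D"]

candidates = consensus_features(lasso_genes, svm_rfe_genes, rf_genes)
print(candidates)
```

Requiring agreement across all methods is a conservative choice: it tends to shrink the candidate list, at the cost of possibly discarding genes that only one or two methods rank highly.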

Results: A total of 95 DEGs were identified. Using three machine learning algorithms, DUSP1 and PRKAR2B were identified as potential biomarker genes for the diagnosis of DKD. The diagnostic efficacy of DUSP1 and PRKAR2B was assessed using the areas under the ROC curves in the training set (0.945 and 0.932, respectively) and the validation set (0.789 and 0.709, respectively). IHC staining suggested that the expression levels of DUSP1 and PRKAR2B were significantly lower in DKD patients than in normal controls. Immune cell infiltration analysis showed that memory B cells, gamma delta T cells, macrophages, and neutrophils may be involved in the development of DKD. Furthermore, both candidate genes are associated with these immune cell subtypes to varying extents.
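
For readers unfamiliar with the metric, the area under the ROC curve for a single gene equals the probability that a randomly chosen DKD sample scores higher than a randomly chosen control, with ties counted as one half. The minimal sketch below computes it that way in plain Python. The function name `roc_auc` and the expression values are hypothetical toy data, not data from the study, and the expression is negated on the assumption that lower expression indicates DKD, in line with the IHC finding.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: higher means "more likely positive"; labels: 1 = DKD, 0 = control.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive sample ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy expression values for one gene (made up for illustration).
expression = [2.1, 2.5, 3.0, 3.8, 4.0, 4.4]
labels = [1, 1, 0, 1, 0, 0]

# Assumption: lower expression suggests DKD, so negate to get a risk score.
scores = [-x for x in expression]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

In this toy data, 8 of the 9 DKD-control pairs are ordered correctly, giving an AUC of about 0.889. The pairwise loop is quadratic in sample count, which is fine for small cohorts; a rank-based formulation would scale better.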

Conclusion: DUSP1 and PRKAR2B are potential diagnostic markers of DKD, and they are closely associated with immune cell infiltration.

Keywords: bioinformatic analysis; diabetic kidney disease; diagnostic biomarker; immune infiltration; machine learning strategy.