Screening of potential biomarkers in peripheral blood of patients with depression based on weighted gene co-expression network analysis and machine learning algorithms

Front Psychiatry. 2022 Oct 17;13:1009911. doi: 10.3389/fpsyt.2022.1009911. eCollection 2022.

Abstract

Background: The prevalence of depression has been increasing worldwide in recent years, posing a heavy burden on patients and society. However, the diagnostic and therapeutic tools available for this disease are inadequate. Therefore, this research focused on the identification of potential biomarkers in the peripheral blood of patients with depression.

Methods: The depression expression dataset GSE98793 was obtained from the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/gds). First, differentially expressed genes (DEGs) were detected in GSE98793. Next, the modules most relevant to depression were screened by weighted gene co-expression network analysis (WGCNA). The identified DEGs were then mapped to the WGCNA module genes to obtain the intersection genes. Gene Ontology (GO), Disease Ontology (DO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses were conducted on these genes. Biomarkers were screened by constructing a protein-protein interaction (PPI) network of the intersection genes and applying various machine learning algorithms. Gene set enrichment analysis (GSEA), immune function analysis, transcription factor (TF) analysis, and prediction of the regulatory mechanism were then performed on the identified biomarkers. We also estimated the clinical diagnostic ability of the obtained biomarkers, and performed Mfuzz expression pattern clustering and functional enrichment of the most promising biomarkers to explore their regulatory mechanisms. Finally, biomarker-related drug prediction was performed.
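
The step that maps DEGs onto the WGCNA module genes amounts to a set intersection. The following is a minimal Python sketch of that one step, not the authors' pipeline: the function name `intersect_gene_sets` and the gene identifiers are hypothetical placeholders, and real inputs would be the DEG list and the module gene list derived from GSE98793.

```python
def intersect_gene_sets(deg_genes, module_genes):
    """Return the sorted gene identifiers present in both collections."""
    return sorted(set(deg_genes) & set(module_genes))


# Hypothetical placeholder identifiers, not genes reported by the study.
deg_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
module_genes = ["GENE_C", "GENE_D", "GENE_E"]

intersection_genes = intersect_gene_sets(deg_genes, module_genes)
print(intersection_genes)
```

Sorting the result keeps the output order stable between runs, which is convenient when the list is passed on to enrichment or PPI tools.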

Results: Differential analysis yielded a total of 550 DEGs, and WGCNA yielded 1,194 significant genes. Intersecting the two sets gave 140 intersection genes. Biological functional analysis indicated that these genes had a major role in inflammation-related bacterial infection pathways and cardiovascular diseases such as atherosclerosis. Subsequently, the genes S100A12, SERPINB2, TIGIT, GRB10, and LHFPL2 in peripheral serum were identified as depression biomarkers by using machine learning algorithms. Among them, S100A12 was the most valuable biomarker for clinical diagnosis. Finally, potential antidepressants, including disodium selenite and eplerenone, were predicted.

Conclusion: The genes S100A12, TIGIT, SERPINB2, GRB10, and LHFPL2 in peripheral serum are viable diagnostic biomarkers for depression and contribute to the diagnosis and prevention of depression in clinical practice.

Keywords: biomarkers; blood; depression; machine learning; weighted correlation network analysis.