An integrated machine learning framework for developing and validating a diagnostic model of major depressive disorder based on interstitial cystitis-related genes

J Affect Disord. 2024 May 14:359:22-32. doi: 10.1016/j.jad.2024.05.061. Online ahead of print.

Abstract

Background: Major depressive disorder (MDD) and interstitial cystitis (IC) are two highly debilitating conditions that often coexist and reciprocally affect each other, significantly exacerbating patients' suffering. However, the molecular underpinnings linking these disorders remain poorly understood.

Methods: Transcriptomic data from GEO datasets, including those of MDD and IC patients, were systematically analyzed to develop and validate our model. Following removal of batch effects, differentially expressed genes (DEGs) between each disease group and its control group were identified. DEGs shared by the two conditions then underwent functional enrichment analyses. Additionally, immune infiltration was quantified using ssGSEA. A diagnostic model for MDD was constructed by exploring 113 combinations of 12 machine learning algorithms with 10-fold cross-validation on the training sets, followed by external validation on test sets. Finally, the "Enrichr" platform was used to identify potential drugs for MDD.
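
The model-selection step can be illustrated with a minimal sketch, assuming each combination pairs a feature-selection step with a classifier and is scored by mean accuracy under 10-fold cross-validation. The abstract does not name the 12 algorithms or the 113 combinations, so the two selectors, two classifiers, the synthetic expression matrix, and all function names below are hypothetical stand-ins, not the authors' implementation.

```python
import itertools
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Hypothetical feature selectors: each returns the gene columns to keep.
def select_all(X, y):
    return list(range(len(X[0])))


def select_top2(X, y):
    """Keep the 2 genes with the largest case/control mean difference."""
    def mean_gap(j):
        cases = [row[j] for row, label in zip(X, y) if label == 1]
        controls = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(statistics.mean(cases) - statistics.mean(controls))
    return sorted(range(len(X[0])), key=mean_gap, reverse=True)[:2]


# Hypothetical classifiers: each returns a predict(row) function.
def fit_nearest_centroid(X, y):
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [statistics.mean(col) for col in zip(*rows)]

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )
    return predict


def fit_majority(X, y):
    """Baseline that always predicts the most frequent training label."""
    majority = max(sorted(set(y)), key=y.count)
    return lambda row: majority


def cross_validate(X, y, selector, fitter, k=10, seed=0):
    """Mean held-out accuracy; selection and fitting use training folds only."""
    scores = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        genes = selector(X_train, y_train)
        predict = fitter([[row[j] for j in genes] for row in X_train], y_train)
        hits = sum(predict([X[i][j] for j in genes]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return statistics.mean(scores)


# Synthetic data (assumption): 20 controls, 20 cases, 4 genes,
# of which only the first two differ between groups.
rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [
    [rng.gauss(3 * label, 0.5), rng.gauss(3 * label, 0.5),
     rng.gauss(0, 1), rng.gauss(0, 1)]
    for label in y
]

selectors = {"all_genes": select_all, "top2_mean_gap": select_top2}
classifiers = {"nearest_centroid": fit_nearest_centroid, "majority": fit_majority}

# Score every selector/classifier combination and keep the best one.
results = {
    (s, c): cross_validate(X, y, selectors[s], classifiers[c])
    for s, c in itertools.product(selectors, classifiers)
}
best = max(results, key=results.get)
print(best, round(results[best], 3))
```

Feature selection is repeated inside each training fold so that held-out samples never influence which genes are kept. A combination chosen this way would still need to be checked on independent test sets, as the Methods describe.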

Results: In total, 21 key genes closely associated with both MDD and IC were identified; enrichment analyses showed that they were predominantly involved in immune processes. Immune infiltration analysis revealed distinct profiles of immune cell infiltration in MDD and IC compared to healthy controls. From these genes, a robust 11-gene (ABCD2, ATP8B4, TNNT1, AKR1C3, SLC26A8, S100A12, PTX3, FAM3B, ITGA2B, OLFM4, BCL7A) diagnostic signature was constructed, which outperformed existing MDD diagnostic models in both training and testing cohorts. Additionally, epigallocatechin gallate and 10 other drugs emerged as potential therapeutic candidates for MDD.

Conclusion: We developed a diagnostic model for MDD by combining bioinformatic techniques and machine learning methods, focusing on genes shared between MDD and IC.

Keywords: Bioinformatics analysis; Diagnostic model; Interstitial cystitis; Machine learning; Major depressive disorder.