A Joint Model of Random Forest and Artificial Neural Network for the Diagnosis of Endometriosis

Front Genet. 2022 Mar 8:13:848116. doi: 10.3389/fgene.2022.848116. eCollection 2022.

Abstract

Endometriosis (EM), an estrogen-dependent inflammatory disease with unknown etiology, affects thousands of childbearing-age couples, and its early diagnosis is still very difficult. With the rapid development of sequencing technology in recent years, the accumulation of many sequencing data makes it possible to screen important diagnostic biomarkers from some EM-related genes. In this study, we utilized public datasets in the Gene Expression Omnibus (GEO) and Array-Express database and identified seven important differentially expressed genes (DEGs) (COMT, NAA16, CCDC22, EIF3E, AHI1, DMXL2, and CISD3) through the random forest classifier. Among these DEGs, AHI1, DMXL2, and CISD3 have never been reported to be associated with the pathogenesis of EMs. Our study indicated that these three genes might participate in the pathogenesis of EMs through oxidative stress, epithelial-mesenchymal transition (EMT) with the activation of the Notch signaling pathway, and mitochondrial homeostasis, respectively. Then, we put these seven DEGs into an artificial neural network to construct a novel diagnostic model for EMs and verified its diagnostic efficacy in two public datasets. Furthermore, these seven DEGs were included in 15 hub genes identified from the constructed protein-protein interaction (PPI) network, which confirmed the reliability of the diagnostic model. We hope the diagnostic model can provide novel sights into the understanding of the pathogenesis of EMs and contribute to the clinical diagnosis and treatment of EMs.

Keywords: artificial neural network; diagnostic efficacy; diagnostic model; endometriosis; random forest.