Systematical analysis of underlying markers associated with Marfan syndrome via integrated bioinformatics and machine learning strategies

J Biomol Struct Dyn. 2023 Jul 14:1-12. doi: 10.1080/07391102.2023.2233021. Online ahead of print.

Abstract

Marfan syndrome (MFS) is a hereditary disease with high mortality. This study aimed to explore potential peripheral blood markers and underlying mechanisms in MFS via a series of bioinformatics and machine learning analyses. First, we downloaded two MFS datasets from the GEO database. A total of 215 differentially expressed genes (DEGs) and 78 differentially expressed miRNAs (DEMs) were identified using the "Limma" package. After protein-protein interaction (PPI) network construction, 60 DEGs were selected, mainly enriched in abnormal transport of structural and energy substances; of these, 20 were chosen for machine learning after filtering with three algorithms (betweenness, closeness, and degree) in Cytoscape. Four overlapping DEGs (ACTN1, CFTR, GCKR, LAMA3) were finally selected as candidate markers based on three machine-learning approaches (Lasso, random forest, and support vector machine-recursive feature elimination). Furthermore, we collected peripheral blood from MFS patients and healthy controls to validate the findings; qRT-PCR showed that the expression of all four DEGs differed significantly between MFS patients and controls. In addition, the area under the receiver operating characteristic curve was greater than 0.8 for each DEG. Single-sample gene-set enrichment analysis showed that the four DEGs were strongly associated with inflammation and myogenesis pathways. Finally, we constructed an mRNA-miRNA network based on the intersection of DEMs and predicted miRNAs targeting the DEGs. In conclusion, our study identified four potential markers for MFS pathogenesis. Communicated by Ramaswamy H. Sarma.

Keywords: Marfan syndrome; mRNA-miRNA network; machine learning; marker.
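
Illustrative sketch (not part of the original abstract). The abstract describes two steps that are easy to show in a few lines: keeping only the genes selected by all three machine-learning approaches, and checking each marker's area under the ROC curve against the 0.8 level. The Python below is a minimal sketch under stated assumptions, not the authors' pipeline. The function names (`overlap`, `auc`, `direction_free_auc`), the placeholder genes (`GENE_A`, `GENE_B`, `GENE_C`), and all expression values are hypothetical and made up for illustration. The AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any specific package the study may have used.

```python
from typing import Sequence, Set


def overlap(*selections: Set[str]) -> Set[str]:
    """Return the genes chosen by every feature-selection method."""
    if not selections:
        return set()
    return set.intersection(*selections)


def auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """Rank-based AUC: the fraction of (case, control) pairs in which the
    case value is higher, counting ties as half a pair."""
    if not case_values or not control_values:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def direction_free_auc(case_values: Sequence[float],
                       control_values: Sequence[float]) -> float:
    """AUC that treats up- and down-regulated markers alike."""
    a = auc(case_values, control_values)
    return max(a, 1.0 - a)


# Hypothetical selections: only the four shared genes come from the abstract;
# GENE_A, GENE_B, and GENE_C are placeholders, not reported results.
lasso = {"ACTN1", "CFTR", "GCKR", "LAMA3", "GENE_A"}
random_forest = {"ACTN1", "CFTR", "GCKR", "LAMA3", "GENE_B"}
svm_rfe = {"ACTN1", "CFTR", "GCKR", "LAMA3", "GENE_A", "GENE_C"}

candidates = overlap(lasso, random_forest, svm_rfe)
print(sorted(candidates))  # ['ACTN1', 'CFTR', 'GCKR', 'LAMA3']

# Made-up expression values (cases, controls) for two placeholder markers.
toy_expression = {
    "MARKER_UP": ([5.1, 4.8, 5.6, 4.9], [3.9, 4.2, 5.0, 4.0]),
    "MARKER_DOWN": ([2.0, 2.4, 1.8, 2.9], [3.1, 2.6, 3.4, 2.2]),
}

for name, (cases, controls) in toy_expression.items():
    score = direction_free_auc(cases, controls)
    print(f"{name}: AUC = {score:.4f}, above 0.8: {score > 0.8}")
```

Taking the maximum of the AUC and its complement is a design choice made here so that down-regulated markers are scored on the same scale as up-regulated ones; the abstract does not state how direction was handled.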