Meta-Analysis of Esophageal Cancer Transcriptomes Using Independent Component Analysis

Front Genet. 2021 Oct 21:12:683632. doi: 10.3389/fgene.2021.683632. eCollection 2021.

Abstract

Independent component analysis (ICA) is a matrix factorization method for data dimension reduction. ICA has been widely applied to transcriptomic data for blind separation of the biological, environmental, and technical factors affecting gene expression. This study aimed to analyze publicly available esophageal cancer (EC) data using ICA in order to identify and comprehensively analyze reproducible signaling pathways and molecular signatures involved in this cancer type. Four independent esophageal cancer transcriptomic datasets from the Gene Expression Omnibus (GEO) database were used. The bioinformatics tool BiODICA (Independent Component Analysis of Big Omics Data) was applied to compute independent components (ICs). Gene Set Enrichment Analysis (GSEA) and ToppGene uncovered the most significantly enriched pathways. Gene networks and graphs were constructed and visualized using Cytoscape and the HPRD database. A correlation graph linking the decompositions into 30 ICs was built, keeping only absolute correlation values exceeding 0.3. Clusters of components (pseudocliques) were observed in the structure of the correlation graph. The top 1,000 most contributing genes of each IC in the pseudocliques were mapped to the protein-protein interaction (PPI) network to construct associated signaling pathways. Some cliques were composed of densely interconnected nodes and included components common to most cancer types (such as cell cycle and extracellular matrix signals), while others were specific to EC. The results of this investigation may reveal potential biomarkers of esophageal carcinogenesis and functional subsystems dysregulated in tumor cells, and may help predict the early development of a tumor.
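The correlation-graph step can be illustrated with a minimal sketch. It is not the BiODICA implementation. It assumes each decomposition is available as a genes-by-components weight matrix, that components from two decompositions are compared by Pearson correlation over their shared genes, and that an edge is kept when the absolute correlation exceeds 0.3. The function name `correlation_edges` and the synthetic data are hypothetical.

```python
import numpy as np

def correlation_edges(S_a, S_b, genes_a, genes_b, threshold=0.3):
    """Return (i, j, r) for component pairs whose |Pearson r| exceeds threshold.

    S_a, S_b: arrays of shape (n_genes, n_components) holding gene weights per IC.
    genes_a, genes_b: gene identifiers for the rows of S_a and S_b.
    """
    # Restrict both decompositions to the genes they share, in the same order.
    shared = sorted(set(genes_a) & set(genes_b))
    idx_a = {g: k for k, g in enumerate(genes_a)}
    idx_b = {g: k for k, g in enumerate(genes_b)}
    A = S_a[[idx_a[g] for g in shared]]
    B = S_b[[idx_b[g] for g in shared]]

    # Standardize columns so that A.T @ B / n gives Pearson correlations.
    A = (A - A.mean(axis=0)) / A.std(axis=0)
    B = (B - B.mean(axis=0)) / B.std(axis=0)
    R = A.T @ B / len(shared)

    # Keep only pairs above the absolute-correlation threshold.
    return [(int(i), int(j), float(R[i, j]))
            for i, j in zip(*np.where(np.abs(R) > threshold))]

# Synthetic stand-ins for two decompositions (assumption: not real GEO data).
rng = np.random.default_rng(0)
genes = [f"g{k}" for k in range(500)]
S_a = rng.laplace(size=(500, 3))
S_b = rng.laplace(size=(500, 3))
# Make component 1 of B a noisy, sign-flipped copy of component 2 of A;
# ICA components are defined only up to sign, hence the absolute value above.
S_b[:, 1] = -S_a[:, 2] + 0.1 * rng.laplace(size=500)
# Present B with its genes in a different order to exercise the alignment.
S_b, genes_b = S_b[::-1], genes[::-1]

edges = correlation_edges(S_a, S_b, genes, genes_b)
print(edges)
```

Applying such a function to every pair of decompositions would yield the edge list of a graph in which groups of mutually correlated components stand out as candidate pseudocliques.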

Keywords: esophageal cancer; genomics; independent component analysis; meta-analysis; transcriptomics.

Grants and funding

The present study was supported by research grants from the Ministry of Education and Science of the Republic of Kazakhstan (AP09058660), by CRP NU grant 021220CRP222 “Identification of a long non-coding RNA (lncRNA) and microRNA in ESCC”, by the Ministry of Science and Higher Education of the Russian Federation (Project No. 075-15-2021-634), and by the French government under the management of the Agence Nationale de la Recherche as part of the “Investissements d’Avenir” program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).