Identification and Validation of Potential Pathogenic Genes and Prognostic Markers in ESCC by Integrated Bioinformatics Analysis

Front Genet. 2020 Dec 10:11:521004. doi: 10.3389/fgene.2020.521004. eCollection 2020.

Abstract

Esophageal squamous cell carcinoma (ESCC) is one of the most fatal malignancies of the digestive tract, but its underlying molecular mechanisms remain unknown. We aimed to identify genes involved in ESCC carcinogenesis and to discover potential prognostic markers using integrated bioinformatics analysis. Three ESCC tissue samples and their paired normal tissues were sequenced by high-throughput RNA sequencing (RNA-seq). Integrated bioinformatics analysis was used to identify differentially expressed coding genes (DECGs) and differentially expressed long non-coding RNA (lncRNA) genes (DELGs). A protein-protein interaction (PPI) network of DECGs was established using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) website and visualized with Cytoscape. Survival analysis was conducted using log-rank tests to identify "hub" genes with potential prognostic value, and real-time reverse transcription-quantitative polymerase chain reaction (RT-qPCR) was used to assess expression of these genes in ESCC tissues. Transwell™ assays were employed to examine the migration ability of cells after knockdown of LINC01614 expression, and epithelial-mesenchymal transition (EMT) was then investigated by western blotting (WB). A total of 106 upregulated genes and 42 downregulated genes were identified from the ESCC data sets. Survival analysis showed that two upregulated hub protein-coding genes in module 1 of the PPI network (SPP1 and BGN) and three upregulated lncRNAs (LINC01614, LINC01415, and NKILA) were associated with a poor prognosis. High expression of SPP1, BGN, LINC01614, and LINC01415 in tumor samples was further validated by RT-qPCR. In vitro experiments showed that knockdown of LINC01614 expression significantly inhibited the migration of ESCC cells by regulating EMT, as confirmed by WB. These results indicate that BGN, SPP1, LINC01614, and LINC01415 might be critical genes in ESCC and potential prognostic biomarkers.

Keywords: RNA-seq; bioinformatics analysis; esophageal squamous cell carcinoma; expression; long non-coding RNA; next-generation sequencing.