Identification of candidate biomarkers associated with gastric cancer prognosis based on an integrated bioinformatics analysis

J Gastrointest Oncol. 2022 Aug;13(4):1690-1700. doi: 10.21037/jgo-22-651.

Abstract

Background: This study sought to identify candidate biomarkers associated with gastric cancer (GC) prognosis based on an integrated bioinformatics analysis.

Methods: The GSE54129 and GSE79973 data sets were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were screened in each data set using the limma package in R, and the DEGs shared by the 2 data sets (intersection DEGs) were obtained by a Venn analysis. Gene clustering and a functional analysis were then performed to explore the roles of the DEGs. A protein-protein interaction (PPI) network of the clustered genes was constructed using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING). A survival analysis evaluated the associations between the candidate genes and the overall survival of GC patients. A drug-gene interaction analysis was also conducted, and The Cancer Genome Atlas-Stomach Adenocarcinoma (TCGA-STAD) data set was used as an external data set to validate the prognostic genes.
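As an illustration only, the Venn intersection and hub-gene steps can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' pipeline: the gene names, directions, and edges are toy placeholders, the requirement that a shared DEG change in the same direction in both data sets is an assumption, and ranking hub genes by node degree is one common choice that the abstract does not confirm.

```python
from collections import defaultdict


def intersect_degs(degs_a, degs_b):
    """Keep genes called as DEGs in both data sets.

    Each argument maps gene symbol -> "up" or "down". Requiring the same
    direction in both data sets is an assumption made for this sketch.
    """
    return {gene: d for gene, d in degs_a.items() if degs_b.get(gene) == d}


def rank_hubs(edges, top_n):
    """Rank genes by degree (number of distinct partners) in a PPI network.

    Degree-based ranking is an assumed hub criterion. Ties are broken
    alphabetically so the result is deterministic.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbours[u].add(v)
            neighbours[v].add(u)
    ranked = sorted(neighbours, key=lambda g: (-len(neighbours[g]), g))
    return ranked[:top_n]


# Toy placeholder inputs (hypothetical, not taken from GSE54129 or GSE79973).
degs_set_1 = {"GENE_A": "up", "GENE_B": "up", "GENE_C": "down", "GENE_D": "up"}
degs_set_2 = {"GENE_A": "up", "GENE_B": "down", "GENE_C": "down", "GENE_E": "up"}
shared = intersect_degs(degs_set_1, degs_set_2)

# Toy placeholder PPI edges; the reversed duplicate is counted only once.
toy_edges = [
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_C", "GENE_A"),
    ("GENE_D", "GENE_E"),
]
hubs = rank_hubs(toy_edges, top_n=2)

print(sorted(shared))
print(hubs)
```

In practice the per-data-set DEG tables would come from the limma output and the edge list from a STRING export; both are replaced here by in-memory toy values.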

Results: We extracted 421 intersection DEGs from the 2 GEO data sets. The DEGs formed 5 gene clusters, and the functional analysis revealed that they were mainly associated with the extracellular matrix-receptor interaction pathway. The PPI analysis identified the top 36 hub genes. The survival analysis revealed that 7 upregulated genes [i.e., platelet-derived growth factor receptor beta (PDGFRB), angiopoietin 2 (ANGPT2), vascular endothelial growth factor C (VEGFC), collagen type IV alpha 2 chain (COL4A2), collagen type IV alpha 1 chain (COL4A1), thrombospondin 1 (THBS1), and fibronectin 1 (FN1)] were associated with the overall survival of GC patients. A total of 20 drug-gene interaction pairs involving 4 genes and 18 drugs were obtained. Finally, the TCGA-STAD data set was used to validate the expression levels of COL4A1, PDGFRB, and FN1.

Conclusions: We identified 7 upregulated genes (i.e., PDGFRB, ANGPT2, VEGFC, COL4A2, COL4A1, THBS1, and FN1) as promising prognostic markers in GC patients.

Keywords: Gastric cancer (GC); differentially expressed gene; protein-protein interaction analysis; survival analysis.