Identification and verification of key cancer genes associated with prognosis of colorectal cancer based on bioinformatics analysis

Zhong Nan Da Xue Xue Bao Yi Xue Ban. 2021 Oct 28;46(10):1063-1070. doi: 10.11817/j.issn.1672-7347.2021.200952.
[Article in English, Chinese]

Abstract

Objectives: Prognostic biomarkers for colorectal cancer (CRC) currently lack sufficient accuracy and sensitivity in clinical practice. Using bioinformatics analysis, we aimed to identify and verify a set of key genes related to the diagnosis and prognosis of CRC.

Methods: Three whole RNA sequencing data sets of CRC and colorectal mucosa (CRM) tissues, GSE31905, GSE35279, and GSE41657, were selected from the NCBI-GEO database, and their differentially expressed genes (DEGs) were analyzed. The DEGs common to all 3 data sets were obtained with a Venn diagram and further enriched using the STRING network system and Cytoscape software. The Kaplan-Meier plotter website was used to verify the correlation between the enriched genes and the prognosis of CRC.

Results: DEGs between CRC and CRM (|log2FC|>2 and P<0.05) were screened in each of the 3 whole RNA sequencing data sets using the GEO2R tool in the NCBI-GEO database. Intersecting the up-regulated and down-regulated genes of the 3 GSE data sets with Venn diagram analysis software yielded a total of 105 up-regulated genes and 140 down-regulated genes common to all 3 data sets. These up-regulated/down-regulated genes were imported into the STRING network system to obtain the interacting genes. The interacting gene sets were then imported into Cytoscape software, where the Molecular Complex Detection (MCODE) plug-in identified 61 up-regulated genes. Using the Kaplan-Meier plotter website, we found that the EPHB2, KLK8, DIAPH3, STC2, OXTR, MMP7, MET, KRT85, KRT6B, KRT23, and KLK10 genes were highly expressed in CRC and were related to prognosis.
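
The screening and intersection steps described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming each data set has already been exported as rows of gene symbol, log2 fold change, and P value; the `Row` type, the function names, the placeholder gene labels, and all numbers are hypothetical and do not come from the study or from GEO2R itself.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One hypothetical GEO2R-style result row."""
    gene: str
    log2fc: float
    p_value: float


def split_degs(rows, lfc_cutoff=2.0, p_cutoff=0.05):
    """Return (up, down) gene sets passing |log2FC| > lfc_cutoff and P < p_cutoff."""
    up, down = set(), set()
    for r in rows:
        if r.p_value >= p_cutoff:
            continue  # not significant at the chosen P threshold
        if r.log2fc > lfc_cutoff:
            up.add(r.gene)
        elif r.log2fc < -lfc_cutoff:
            down.add(r.gene)
    return up, down


def common_degs(datasets):
    """Intersect the up and down sets across all data sets (the Venn core)."""
    ups, downs = zip(*(split_degs(rows) for rows in datasets))
    return set.intersection(*ups), set.intersection(*downs)


# Made-up values for three data sets; placeholder gene labels only.
datasets = [
    [Row("GENE_A", 3.1, 0.001), Row("GENE_B", -2.5, 0.01),
     Row("GENE_C", 2.4, 0.20), Row("GENE_D", 1.5, 0.001)],
    [Row("GENE_A", 2.2, 0.03), Row("GENE_B", -3.0, 0.002),
     Row("GENE_C", 2.9, 0.01)],
    [Row("GENE_A", 4.0, 0.0001), Row("GENE_B", -2.1, 0.04),
     Row("GENE_D", 2.6, 0.01)],
]
common_up, common_down = common_degs(datasets)
```

Keeping up-regulated and down-regulated genes in separate sets before intersecting mirrors the description above, where the two directions are intersected independently; a gene that is up-regulated in one data set and down-regulated in another is therefore excluded from both common lists.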

Conclusions: The 11 genes identified and verified by bioinformatics analysis may predict poor prognosis of CRC to some extent and provide possible targets for the diagnosis and treatment of CRC.
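
For readers unfamiliar with the survival comparison behind such prognostic claims, the standard Kaplan-Meier product-limit estimator can be sketched as follows. This is a generic textbook formulation, not the implementation used by the Kaplan-Meier plotter website, whose internals are not described here; the function name and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up follow-up data (months) for one hypothetical expression group.
times = [5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

In a prognostic analysis, one such curve would typically be computed for a high-expression group and another for a low-expression group of the same gene, and the two curves compared with a log-rank test, which this sketch does not include.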

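Objective: Tumor markers for the prognosis of colorectal cancer (CRC) still lack reliable accuracy and sensitivity. This study aimed to use bioinformatics methods to screen and verify a group of genes related to the diagnosis and prognosis of CRC. Methods: The whole RNA sequencing data sets GSE31905, GSE35279, and GSE41657, covering CRC and normal colorectal mucosa (CRM) tissue specimens, were selected from the NCBI-GEO database, and their differentially expressed genes (DEGs) were analyzed. The DEGs common to the 3 data sets were first obtained with a Venn diagram, then further enriched with the STRING network system and Cytoscape software, and finally the correlation between the enriched genes and CRC prognosis was verified on the Kaplan-Meier plotter website. Results: The DEGs between CRC and CRM in each of the 3 data sets (|log2FC|>2 and P<0.05) were screened with the GEO2R tool built into the NCBI-GEO database. Using Venn diagram analysis software, the up-regulated and down-regulated genes of the 3 data sets were intersected separately, giving 105 shared up-regulated genes and 140 shared down-regulated genes. These up-regulated/down-regulated genes were imported into the STRING network system to obtain the interacting genes. The interacting gene sets were imported into Cytoscape software, and 61 up-regulated genes were found with the MCODE (Molecular Complex Detection) plug-in. Finally, the Kaplan-Meier plotter website yielded 11 genes, EPHB2, KLK8, DIAPH3, STC2, OXTR, MMP7, MET, KRT85, KRT6B, KRT23, and KLK10, that were highly expressed in CRC and related to prognosis. Conclusion: Based on bioinformatics analysis and verification, these 11 genes can predict poor prognosis of CRC to a certain extent and provide possible targets for the diagnosis, treatment, and prognosis of CRC.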

Keywords: RNA sequencing; bioinformatics analysis; colorectal cancer; differentially expressed genes.

MeSH terms

  • Biomarkers, Tumor / genetics
  • Biomarkers, Tumor / metabolism
  • Colorectal Neoplasms* / genetics
  • Computational Biology*
  • Formins
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Glycoproteins
  • Humans
  • Intercellular Signaling Peptides and Proteins
  • Oncogenes
  • Prognosis
  • Protein Interaction Maps

Substances

  • Biomarkers, Tumor
  • DIAPH3 protein, human
  • Formins
  • Glycoproteins
  • Intercellular Signaling Peptides and Proteins
  • STC2 protein, human