Screening and identification of hub gene and differential gene and mutation sequence analysis of related genes in colorectal cancer based on bioinformatics analysis

J Gastrointest Oncol. 2022 Dec;13(6):3056-3066. doi: 10.21037/jgo-22-1131.

Abstract

Background: Genomics research is advancing rapidly, and bioinformatics methods that systematically explore pathogenic genes and their regulatory mechanisms can substantially advance cancer research. This study aimed to use The Cancer Genome Atlas (TCGA) database to extract inflammation-related non-coding RNAs, construct a prognostic model of colon cancer, and search for new immunotherapeutic targets.

Methods: Transcriptome sequencing data and clinical data of 396 colon cancer patients were downloaded from the TCGA database, and inflammation-related non-coding RNAs were obtained from the non-coding RNAs in Inflammation (ncRI) database. The prognostic model was constructed by univariate Cox regression, least absolute shrinkage and selection operator (LASSO) regression, and multivariate Cox regression, and the optimal grouping threshold for the risk score was determined with X-Tile software. Patients were then stratified by risk score to explore differences in immune cell infiltration and biological function between the high- and low-risk groups.
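The risk-stratification step can be illustrated with a minimal sketch. It assumes the common Cox-model convention that a patient's risk score is the sum of each retained feature's expression multiplied by its multivariate Cox coefficient. The feature names, coefficients, expression values, and cutoff below are hypothetical placeholders, not values from this study, where the cutoff was chosen with X-Tile.

```python
# Hypothetical multivariate Cox coefficients for the retained ncRNA features.
COEFFICIENTS = {"ncRNA_A": 0.42, "ncRNA_B": -0.31, "ncRNA_C": 0.18}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over model features."""
    return sum(coef * expression[name] for name, coef in coefficients.items())


def stratify(patients, cutoff):
    """Label each patient 'high' or 'low' risk by comparing the score to the cutoff."""
    return {
        pid: ("high" if risk_score(expr) > cutoff else "low")
        for pid, expr in patients.items()
    }


# Toy expression values (hypothetical), one dict per patient.
PATIENTS = {
    "P1": {"ncRNA_A": 2.0, "ncRNA_B": 1.0, "ncRNA_C": 0.5},
    "P2": {"ncRNA_A": 0.5, "ncRNA_B": 3.0, "ncRNA_C": 1.0},
}
CUTOFF = 0.0  # assumed threshold; a real one would come from X-Tile or similar

groups = stratify(PATIENTS, CUTOFF)
```

In practice the coefficients would come from the fitted multivariate Cox model, and the two resulting groups would be compared by survival analysis.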

Results: Using the TCGA colon cancer dataset, 120 differentially expressed genes (DEGs) that overlapped in the 2 datasets were identified, of which 29 were up-regulated and 91 were down-regulated. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of these 120 DEGs showed that proximal tubule sodium bicarbonate recovery, nitrogen metabolism, pancreatic fluid secretion, and the PPAR signaling pathway were closely related to the occurrence of colon cancer. The expression of copper death-related (cuproptosis-related) genes was significantly correlated with colon cancer (P<0.01). Gene Ontology analysis showed that the DEGs were mainly enriched in messenger RNA processing, RNA splicing, small G protein-mediated signal transduction, adhesion junction, mitochondrial matrix, mitochondrial protein complex, chromatin binding, small G protein binding, and Ras G protein binding, among others. KEGG analysis showed that the DEGs were enriched in pathways including herpes simplex virus type 1 infection, pathways of neurodegenerative diseases, Huntington's disease, prion disease, Parkinson's disease, and the Ras signaling pathway.
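As an illustration of how overlapping DEGs can be screened and split into up- and down-regulated sets, the sketch below filters two result tables and intersects them. The thresholds (|log2 fold change| >= 1, adjusted P < 0.05), the requirement that direction agree in both datasets, and the gene names and values are all assumptions for illustration; the abstract does not state the criteria actually used.

```python
def call_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return {gene: 'up' | 'down'} for genes passing both (assumed) thresholds.

    `results` maps gene -> (log2 fold change, adjusted P value).
    """
    degs = {}
    for gene, (log2fc, padj) in results.items():
        if padj < padj_cutoff and abs(log2fc) >= lfc_cutoff:
            degs[gene] = "up" if log2fc > 0 else "down"
    return degs


def overlap_degs(degs_a, degs_b):
    """Keep genes called in both datasets with the same direction of change."""
    return {gene: d for gene, d in degs_a.items() if degs_b.get(gene) == d}


# Toy differential-expression results (hypothetical genes and values).
dataset_a = {
    "GENE1": (2.1, 0.001),
    "GENE2": (-1.8, 0.01),
    "GENE3": (0.4, 0.2),
    "GENE4": (-2.5, 0.003),
}
dataset_b = {
    "GENE1": (1.6, 0.02),
    "GENE2": (-1.2, 0.04),
    "GENE3": (1.5, 0.01),
    "GENE4": (1.3, 0.01),
}

shared = overlap_degs(call_degs(dataset_a), call_degs(dataset_b))
n_up = sum(1 for d in shared.values() if d == "up")
n_down = sum(1 for d in shared.values() if d == "down")
```

Here GENE4 is dropped because its direction differs between the two toy datasets, and GENE3 is dropped because it fails the thresholds in the first one.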

Conclusions: Bioinformatics analysis effectively identified key genes closely related to colon cancer, providing a theoretical basis for further study of its mechanisms.

Keywords: Colon cancer; bioinformatics analysis; immune infiltration; inflammation-related noncoding RNA; prognostic model.