Identification of differentially expressed genes and biological characteristics of colorectal cancer by integrated bioinformatics analysis

J Cell Physiol. 2019 Sep;234(9):15215-15224. doi: 10.1002/jcp.28163. Epub 2019 Jan 16.

Abstract

Colorectal cancer (CRC) is one of the most common malignant tumors worldwide, and its mortality rate has remained high in recent years. This study therefore aimed to identify significant differentially expressed genes (DEGs) involved in CRC pathogenesis that may serve as novel biomarkers or potential therapeutic targets. The gene expression profiles GSE21510, GSE32323, GSE89076, and GSE113513 were downloaded from the Gene Expression Omnibus (GEO) database. After screening DEGs in each GEO data set, we used the robust rank aggregation method to identify 494 significant DEGs, comprising 212 upregulated and 282 downregulated genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed with DAVID and the KOBAS online database, respectively. These DEGs were significantly enriched in various cancer-related functions and pathways. The STRING database was then used to construct the protein-protein interaction network, and module analysis of the whole network was performed with the MCODE plug-in of Cytoscape. Finally, we identified seven hub genes with the cytoHubba plug-in: PPBP, CCL28, CXCL12, INSL5, CXCL3, CXCL10, and CXCL11. Expression validation and survival analysis of these hub genes were performed using The Cancer Genome Atlas database. In conclusion, robust DEGs associated with CRC carcinogenesis were screened from the GEO database through integrated bioinformatics analysis. Our study provides reliable molecular biomarkers for the screening, diagnosis, and prognosis of CRC, as well as novel therapeutic targets.

Keywords: GEO; bioinformatics; colorectal cancer; differentially expressed genes; robust rank aggregation.
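
The robust rank aggregation step named in the abstract can be illustrated with a minimal sketch. The abstract does not state which implementation was used, so this is an assumption-laden toy version of the general idea, not the authors' code: each gene's rank in each data set is normalized to (0, 1], and the gene is scored by how unlikely its best order statistic would be if ranks were uniformly random. The function names `rho_score` and `aggregate` and the gene labels `geneA` to `geneD` are hypothetical placeholders, and the ranked lists are made up for illustration only.

```python
from math import comb


def rho_score(normalized_ranks):
    """Score one gene from its normalized ranks (each in (0, 1]).

    For the k-th smallest rank x, compute the probability that at least k
    of n uniform random ranks are <= x, take the minimum over k, and apply
    a Bonferroni-style correction by multiplying by n (capped at 1.0).
    """
    r = sorted(normalized_ranks)
    n = len(r)
    best = 1.0
    for k, x in enumerate(r, start=1):
        # Binomial tail: P(at least k of n uniform draws fall at or below x)
        p = sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
        best = min(best, p)
    return min(1.0, best * n)


def aggregate(ranked_lists):
    """Combine several ranked gene lists into one list sorted by score.

    Assumption: a gene missing from a list gets normalized rank 1.0,
    i.e. it is treated as least informative for that list.
    """
    genes = set().union(*ranked_lists)
    scores = {}
    for gene in genes:
        ranks = [
            (lst.index(gene) + 1) / len(lst) if gene in lst else 1.0
            for lst in ranked_lists
        ]
        scores[gene] = rho_score(ranks)
    # Smaller score = more consistently near the top; ties broken by name
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Toy input: three hypothetical per-data-set rankings (most significant first)
toy_lists = [
    ["geneA", "geneB", "geneC", "geneD"],
    ["geneA", "geneC", "geneB", "geneD"],
    ["geneB", "geneA", "geneD", "geneC"],
]

for gene, score in aggregate(toy_lists):
    print(gene, round(score, 4))
```

In practice, upregulated and downregulated genes would typically be ranked and aggregated separately, and the per-data-set rankings would come from the DEG screening step rather than from hand-written lists.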