Integrating Genome-Wide Association Studies and Gene Expression Profiles With Chemical-Genes Interaction Networks to Identify Chemicals Associated With Colorectal Cancer

Front Genet. 2020 Apr 24:11:385. doi: 10.3389/fgene.2020.00385. eCollection 2020.

Abstract

Colorectal cancer (CRC) is the third most common cancer and has the second highest mortality of all cancers worldwide. Exploring the associations between chemicals and CRC is of great significance for the prevention and treatment of tumors. This study aims to explore the genetic basis of the relationships between CRC and environmental chemicals through bioinformatics analysis. The genome-wide association study (GWAS) datasets for CRC were obtained from the UK Biobank. The GWAS data for colon cancer (category C18) include 2,581 cases and 449,683 controls, while those for rectal cancer (category C20) include 1,244 cases and 451,020 controls. In addition, we obtained CRC gene expression datasets from NCBI-GEO (GSE106582). Chemical-related gene sets were acquired from the Comparative Toxicogenomics Database (CTD). Transcriptome-wide association study (TWAS) analysis was applied to the CRC GWAS summary data, and expression association test statistics were calculated with FUSION software. We then performed chemical-related gene set enrichment analysis (GSEA), integrating the GWAS summary data, CRC mRNA expression profiles, and the CTD chemical-gene interaction networks to identify relationships between chemicals and CRC genes. We observed several significant correlations between chemicals and CRC. We also detected 5 chemicals common to colon and rectal cancer: methylnitronitrosoguanidine, isoniazid, PD 0325901, sulindac sulfide, and importazole. By combining TWAS and GSEA, our study linked prior knowledge to newly generated data and thereby helped identify chemicals related to tumor genes, providing new clues to the associations between environmental chemicals and cancer.

Keywords: colorectal cancer; comparative toxicogenomics database; gene set enrichment analysis; genome-wide association study; transcriptome-wide association study.
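
Illustrative sketch: the abstract does not spell out how the chemical-related GSEA statistic is computed, so the following is only a minimal sketch of the classical running-sum enrichment score with a permutation p-value, not the authors' pipeline. It assumes each gene already has a single combined score (for example, one derived from TWAS and expression statistics); the function names, gene labels, and scores are hypothetical.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Signed maximum deviation of a weighted running sum over a ranked gene list."""
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    # Normalizer so that the hit increments sum to 1.
    hit_norm = sum(abs(scores[g]) ** weight
                   for g, h in zip(ranked_genes, hits) if h)
    if hit_norm == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g, h in zip(ranked_genes, hits):
        if h:
            running += abs(scores[g]) ** weight / hit_norm
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(scores, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    genes = list(scores)
    ranked = sorted(genes, key=scores.get, reverse=True)
    observed = enrichment_score(ranked, scores, gene_set)
    k = len(gene_set & set(genes))
    exceed = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(genes, k))
        if abs(enrichment_score(ranked, scores, random_set)) >= abs(observed):
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical per-gene scores and a hypothetical chemical-related gene set.
toy_scores = {"G1": 3.0, "G2": 2.5, "G3": 0.5, "G4": -0.2, "G5": -1.0, "G6": -2.0}
toy_gene_set = {"G1", "G2"}
es, p = permutation_p_value(toy_scores, toy_gene_set, n_perm=200)
print(es, p)
```

In a real analysis, one gene set per CTD chemical would be tested this way and the resulting p-values corrected for multiple testing; that step is omitted here.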