A Multi-Source Data Fusion Framework for Revealing the Regulatory Mechanism of Breast Cancer Immune Evasion

Front Genet. 2020 Nov 12:11:595324. doi: 10.3389/fgene.2020.595324. eCollection 2020.

Abstract

Precision medicine urgently needs a better understanding of the immune evasion mechanisms of tumor development, especially because tumor heterogeneity significantly affects the efficacy of immunotherapy. Identifying breast cancer subtypes based on immune-related genes helps to clarify the immune escape pathways that dominate in each subtype, so that effective treatment can be tailored to each one. To this end, we applied non-negative matrix factorization and a consensus clustering algorithm to The Cancer Genome Atlas (TCGA) RNA-seq breast cancer data and identified 4 subtypes according to curated immune-related genes. We then conducted differential expression analysis between each breast cancer subtype and normal tissue, using RNA-seq data from non-cancer individuals collected by the Genotype-Tissue Expression (GTEx) project, to identify subtype-related immune genes. After that, we carried out correlation analysis between copy number variants (CNV) and mRNA expression of the immune genes and, for the immune genes whose expression cannot be explained by CNV, investigated the regulatory mechanism using ATAC-seq data. The experimental results suggest that CDH1 and PVRL2 are potentially involved in immune evasion in all 4 subtypes. The expression variation of CDH1 can be explained mainly by its CNV, whereas the expression variation of PVRL2 is more likely regulated by transcription factors.
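The CNV-mRNA correlation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the correlation measure or cutoff, so Pearson correlation and the threshold of 0.5 are assumptions, as are the function names and the gene labels GENE_A and GENE_B. The input values are synthetic toy numbers, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant vector has no defined correlation; treat it as "no association".
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def split_by_cnv_correlation(cnv, mrna, threshold=0.5):
    """Split genes into those whose expression tracks their CNV and those that do not.

    cnv, mrna: dicts mapping gene name -> per-sample values (same sample order).
    threshold: assumed cutoff on the correlation coefficient (hypothetical value).
    """
    explained, unexplained = [], []
    for gene in sorted(cnv.keys() & mrna.keys()):
        r = pearson(cnv[gene], mrna[gene])
        (explained if r >= threshold else unexplained).append(gene)
    return explained, unexplained


# Synthetic toy data: five samples per gene (hypothetical gene labels).
cnv = {
    "GENE_A": [-1.0, -0.5, 0.0, 0.5, 1.0],
    "GENE_B": [-1.0, -0.5, 0.0, 0.5, 1.0],
}
mrna = {
    "GENE_A": [2.0, 3.1, 3.9, 5.2, 6.0],  # rises with copy number
    "GENE_B": [4.0, 1.0, 5.0, 1.5, 4.2],  # no clear relation to copy number
}

explained, unexplained = split_by_cnv_correlation(cnv, mrna)
print("CNV-explained:", explained)
print("Candidates for other regulation:", unexplained)
```

In this sketch, genes that fall into the second list would be the ones carried forward to a chromatin-accessibility analysis such as the ATAC-seq step.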

Keywords: complex diseases; correlation analysis; data fusion; data mining; immune evasion.