Integrative analysis of histopathological images and chromatin accessibility data for estrogen receptor-positive breast cancer

BMC Med Genomics. 2020 Dec 28;13(Suppl 11):195. doi: 10.1186/s12920-020-00828-4.

Abstract

Background: Existing studies have demonstrated that the integrative analysis of histopathological images and genomic data can be used to better understand the onset and progression of many diseases, as well as to identify new diagnostic and prognostic biomarkers. However, because the development of pathological phenotypes is influenced by a variety of complex biological processes, a complete understanding of the gene regulatory mechanisms underlying cell and tissue morphology remains a challenge. In this study, we explored the relationship between chromatin accessibility changes and the epithelial tissue proportion in histopathological images of estrogen receptor (ER)-positive breast cancer.

Methods: An established whole slide image processing pipeline based on deep learning was used to perform global segmentation of epithelial and stromal tissues. We then used canonical correlation analysis to detect regulatory regions associated with the epithelial tissue proportion. By integrating ATAC-seq data with matched RNA-seq data, we identified the potential target genes associated with these regulatory regions. We then used these genes for downstream pathway and survival analyses.
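As an illustration of the image-derived feature used above, the following is a minimal sketch of how an epithelial tissue proportion could be computed from a segmentation mask. It is not the authors' pipeline: the label encoding (0 = background, 1 = epithelium, 2 = stroma), the function name `epithelial_proportion`, and the choice of denominator (epithelial plus stromal pixels) are all assumptions made for this example.

```python
import numpy as np

# Hypothetical label encoding for a segmentation mask (an assumption,
# not taken from the study): 0 = background, 1 = epithelium, 2 = stroma.
BACKGROUND, EPITHELIUM, STROMA = 0, 1, 2


def epithelial_proportion(mask):
    """Return the fraction of tissue pixels labelled as epithelium.

    Tissue is taken here to mean epithelial plus stromal pixels;
    background pixels are ignored.
    """
    mask = np.asarray(mask)
    n_epithelium = np.count_nonzero(mask == EPITHELIUM)
    n_stroma = np.count_nonzero(mask == STROMA)
    n_tissue = n_epithelium + n_stroma
    if n_tissue == 0:
        raise ValueError("mask contains no epithelial or stromal pixels")
    return n_epithelium / n_tissue


# Toy 2 x 3 mask: three epithelial pixels, two stromal, one background.
toy_mask = [[0, 1, 1],
            [2, 2, 1]]
proportion = epithelial_proportion(toy_mask)  # 3 / (3 + 2) = 0.6
```

In a whole slide setting, the same count would presumably be accumulated over all tiles of a slide before dividing, so that one proportion is obtained per patient.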

Results: Using canonical correlation analysis, we detected 436 potential regulatory regions that exhibited significant correlation between quantitative chromatin accessibility changes and the epithelial tissue proportion in tumors from 54 patients (FDR < 0.05). We then found that these 436 regulatory regions were associated with 74 potential target genes. Functional enrichment analysis showed that these potential target genes were enriched in cancer-associated pathways. We further demonstrated that combining the gene expression signals and the epithelial tissue proportion extracted from this integration framework stratified patient prognoses more accurately than predictions based on omics or image features alone.
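The FDR < 0.05 threshold above implies a multiple-testing correction across regulatory regions. The abstract does not state which procedure was used; the sketch below assumes the widely used Benjamini-Hochberg method and shows how per-region p-values could be converted to adjusted values. The function name `benjamini_hochberg` and the toy p-values are hypothetical and are not data from the study.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                         # indices of ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)   # p_(i) * n / i
    # Enforce monotonicity: each adjusted value is the minimum of the
    # scaled values at its own rank and all higher ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Toy p-values for four hypothetical regions (not data from the study).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)   # [0.02, 0.04, 0.04, 0.02]
significant = toy_q < 0.05          # regions passing FDR < 0.05
```

Applied to one p-value per ATAC-seq region, a filter of this kind would retain the regions whose adjusted value falls below 0.05.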

Conclusion: This integrative analysis is a useful strategy for identifying potential regulatory regions in the human genome that are associated with tumor tissue quantification. This study will enable efficient prioritization of genomic regulatory regions identified by ATAC-seq data for further studies to validate their causal regulatory function. Ultimately, identifying epithelial tissue proportion-associated regulatory regions will further our understanding of the underlying molecular mechanisms of disease and inform the development of potential therapeutic targets.

Keywords: ATAC-seq; Bioinformatics; Chromatin accessibility data; Computational biology; Histopathological images; Integrative analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Biomarkers, Tumor / genetics
  • Biomarkers, Tumor / metabolism*
  • Breast Neoplasms / diagnostic imaging
  • Breast Neoplasms / genetics*
  • Breast Neoplasms / metabolism
  • Breast Neoplasms / pathology*
  • Chromatin / genetics*
  • Computational Biology / methods
  • Estrogen Receptor alpha / genetics
  • Estrogen Receptor alpha / metabolism*
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Middle Aged
  • Molecular Imaging / methods*
  • Prognosis
  • Promoter Regions, Genetic
  • Regulatory Elements, Transcriptional
  • Survival Rate

Substances

  • Biomarkers, Tumor
  • Chromatin
  • ESR1 protein, human
  • Estrogen Receptor alpha