Development and interpretation of a pathomics-based model for the prediction of microsatellite instability in Colorectal Cancer

Theranostics. 2020 Sep 2;10(24):11080-11091. doi: 10.7150/thno.49864. eCollection 2020.

Abstract

Microsatellite instability (MSI) has been approved as a pan-cancer biomarker for immune checkpoint blockade (ICB) therapy. However, current MSI identification methods are not available for all patients. We proposed an ensemble multiple-instance deep learning model to predict microsatellite status from histopathology images, and interpreted the pathomics-based model through multi-omics correlation.

Methods: Two cohorts of patients were collected: 429 from The Cancer Genome Atlas (TCGA-COAD) and 785 from an Asian colorectal cancer (CRC) cohort (Asian-CRC). We established the pathomics model, named Ensembled Patch Likelihood Aggregation (EPLA), in two consecutive stages: patch-level prediction and whole-slide image (WSI)-level prediction. The initial model was developed and validated in TCGA-COAD, and then generalized to Asian-CRC through transfer learning. The pathological signatures extracted from the model were analyzed against genomic and transcriptomic profiles for model interpretation.

Results: The EPLA model achieved an area under the curve (AUC) of 0.8848 (95% CI: 0.8185-0.9512) in the TCGA-COAD test set and an AUC of 0.8504 (95% CI: 0.7591-0.9323) in the external validation set, Asian-CRC, after transfer learning. Notably, EPLA captured the relationship between the pathological phenotype of poor differentiation and MSI (P < 0.001). Furthermore, the five pathological imaging signatures identified from the EPLA model were associated with mutation burden and DNA damage repair-related genotypes in the genomic profiles, and with activated antitumor immunity pathways in the transcriptomic profiles.

Conclusions: Our pathomics-based deep learning model can effectively predict MSI from histopathology images and is transferable to a new patient cohort. The interpretability of our model, through its association with pathological, genomic and transcriptomic phenotypes, lays the foundation for prospective clinical trials of this artificial intelligence (AI) platform in ICB therapy.
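As an illustration of the two-stage design (patch-level prediction followed by WSI-level prediction), the following minimal Python sketch shows how per-patch MSI likelihoods might be aggregated into one slide-level score. The abstract does not specify the aggregation rules, so everything here is an assumption made for illustration: the function names, the histogram summary, the 0.5 threshold, and the equal-weight average of two simple aggregators are hypothetical and are not the authors' EPLA implementation.

```python
import numpy as np


def likelihood_histogram(patch_probs, n_bins=10):
    """Summarize one slide's patch-level likelihoods as a normalized histogram.

    Hypothetical helper: a fixed-length slide descriptor that a downstream
    WSI-level classifier could consume.
    """
    probs = np.asarray(patch_probs, dtype=float)
    if probs.size == 0:
        raise ValueError("a slide needs at least one patch")
    counts, _ = np.histogram(probs, bins=n_bins, range=(0.0, 1.0))
    return counts / counts.sum()


def ensemble_slide_score(patch_probs, threshold=0.5):
    """Average two simple WSI-level aggregators (illustrative assumption).

    1. mean patch likelihood
    2. fraction of patches whose likelihood reaches `threshold`
    """
    probs = np.asarray(patch_probs, dtype=float)
    if probs.size == 0:
        raise ValueError("a slide needs at least one patch")
    mean_score = probs.mean()
    vote_score = (probs >= threshold).mean()
    return 0.5 * (mean_score + vote_score)


# Toy patch likelihoods for a single slide (made-up values, not study data).
slide = [0.95, 0.85, 0.15, 0.75]
hist = likelihood_histogram(slide, n_bins=5)
score = ensemble_slide_score(slide)
```

In this toy setup, a slide would be called MSI when `score` exceeds a cutoff chosen on a validation set. A trained WSI-level classifier operating on descriptors such as `hist` would be the more realistic second stage.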

Keywords: colorectal cancer; ensembled patch likelihood aggregation (EPLA); microsatellite instability; multi-omics; pathomics.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Cohort Studies
  • Colon / pathology
  • Colorectal Neoplasms / drug therapy
  • Colorectal Neoplasms / genetics*
  • Colorectal Neoplasms / immunology
  • Colorectal Neoplasms / pathology
  • DNA Damage
  • DNA Repair
  • Datasets as Topic
  • Deep Learning
  • Drug Resistance, Neoplasm / genetics
  • Gene Expression Profiling
  • Genomics / methods
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Immune Checkpoint Inhibitors / pharmacology*
  • Immune Checkpoint Inhibitors / therapeutic use
  • Microsatellite Instability*
  • Models, Genetic
  • ROC Curve
  • Rectum / pathology

Substances

  • Biomarkers, Tumor
  • Immune Checkpoint Inhibitors