Clinical actionability of triaging DNA mismatch repair deficient colorectal cancer from biopsy samples using deep learning

EBioMedicine. 2022 Jul:81:104120. doi: 10.1016/j.ebiom.2022.104120. Epub 2022 Jun 23.

Abstract

Background: We aimed to develop a deep learning (DL) model to predict DNA mismatch repair (MMR) status in colorectal cancers (CRC) based on hematoxylin and eosin-stained whole-slide images (WSIs) and assess its clinical applicability.

Methods: The DL model was developed and validated through three-fold cross-validation using 441 WSIs from The Cancer Genome Atlas (TCGA) and externally validated using 78 WSIs from the Pathology AI Platform (PAIP), together with 355 WSIs from surgical specimens and 341 WSIs from biopsy specimens from the Sun Yat-sen University Cancer Center (SYSUCC). Domain adaptation and multiple instance learning (MIL) techniques were adopted for model development. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). A dual-threshold strategy was also built from the surgical cohorts and validated in the biopsy cohort. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1-score, and the percentage of patients avoiding immunohistochemistry (IHC) testing were evaluated.
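The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names `confusion_metrics` and `auroc`, the coding of dMMR as 1 and proficient MMR as 0, and the toy inputs are assumptions made for illustration. AUROC is computed here by the standard pairwise (Mann-Whitney) formulation.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Binary metrics, assuming dMMR is coded 1 (positive) and pMMR 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUROC is quadratic in cohort size, which is acceptable for cohorts of a few hundred slides; a rank-based implementation would be preferable for larger data.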

Findings: The MIL model achieved an AUROC of 0·8888±0·0357 in the TCGA-validation cohort, 0·8806±0·0232 in the PAIP cohort, 0·8457±0·0233 in the SYSUCC-surgical cohort, and 0·7679±0·0342 in the SYSUCC-biopsy cohort. A dual-threshold triage strategy was used to rule in and rule out deficient MMR (dMMR) patients, with the remaining uncertain patients recommended for further IHC testing. This strategy kept sensitivity above 90% and specificity above 95% for dMMR triage in both surgical and biopsy specimens, resulting in more than half of patients avoiding IHC-based MMR testing.
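The dual-threshold triage logic can be sketched as follows. This is an illustrative reading of the strategy, not the published implementation: the function names, the decision labels, and the example cut-offs of 0.2 and 0.8 are hypothetical, and the abstract does not report the thresholds actually used.

```python
from typing import List, Sequence


def triage(scores: Sequence[float], low: float, high: float) -> List[str]:
    """Assign each patient-level dMMR score to rule-in, rule-out, or IHC referral.

    Scores at or above `high` are ruled in as dMMR, scores at or below `low`
    are ruled out, and everything in between is referred for IHC testing.
    Both thresholds are hypothetical values chosen by the caller.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    decisions = []
    for score in scores:
        if score >= high:
            decisions.append("rule-in")
        elif score <= low:
            decisions.append("rule-out")
        else:
            decisions.append("ihc")
    return decisions


def fraction_avoiding_ihc(decisions: Sequence[str]) -> float:
    """Share of patients who received a rule-in or rule-out decision."""
    if not decisions:
        raise ValueError("no decisions given")
    return sum(1 for d in decisions if d != "ihc") / len(decisions)


# Toy scores, not study data.
example = triage([0.95, 0.05, 0.50, 0.10], low=0.2, high=0.8)
```

Widening the gap between the two thresholds sends more patients to IHC but makes the rule-in and rule-out decisions more reliable, which is the trade-off the triage strategy tunes.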

Interpretation: A DL-based method that could directly predict CRC MMR status from WSIs was successfully developed, and a dual-threshold triage strategy was established to minimize the number of patients requiring further IHC testing.

Funding: The study was funded by the National Natural Science Foundation of China (82073159, 81871971 and 81700576), the Natural Science Foundation of Guangdong Province (No. 2021A1515011792 and No. 2022A1515012403) and the Medical Scientific Research Foundation of Guangdong Province of China (No. A2020392).

Keywords: Colorectal cancer; Deep learning; Dual-threshold; Mismatch repair-deficient; Screening strategy.

MeSH terms

  • Biopsy
  • Colorectal Neoplasms* / diagnosis
  • Colorectal Neoplasms* / genetics
  • Colorectal Neoplasms* / pathology
  • DNA Mismatch Repair / genetics
  • Deep Learning*
  • Humans
  • Triage