Comparison of the Diagnostic Accuracy of Mammogram-based Deep Learning and Traditional Breast Cancer Risk Models in Patients Who Underwent Supplemental Screening with MRI

Radiology. 2023 Sep;308(3):e223077. doi: 10.1148/radiol.223077.

Abstract

Background: Access to supplemental screening breast MRI is determined using traditional risk models, which are limited by modest predictive accuracy.

Purpose: To compare the diagnostic accuracy of a mammogram-based deep learning (DL) risk assessment model to that of traditional breast cancer risk models in patients who underwent supplemental screening with MRI.

Materials and Methods: This retrospective study included consecutive patients undergoing breast cancer screening MRI from September 2017 to September 2020 at four facilities. Risk was assessed using the Tyrer-Cuzick (TC) and National Cancer Institute Breast Cancer Risk Assessment Tool (BCRAT) 5-year and lifetime models as well as a DL 5-year model that generated a risk score based on the most recent screening mammogram. A risk score of 1.67% or higher defined increased risk for traditional 5-year models, a risk score of 20% or higher defined high risk for traditional lifetime models, and absolute scores of 2.3 or higher and 6.6 or higher defined increased and high risk, respectively, for the DL model. Model accuracy metrics including cancer detection rate (CDR) and positive predictive values (PPVs) (PPV of abnormal findings at screening [PPV1], PPV of biopsies recommended [PPV2], and PPV of biopsies performed [PPV3]) were compared using logistic regression models.

Results: This study included 2168 women who underwent 4247 high-risk screening MRI examinations (median age, 54 years [IQR, 48-60 years]). CDR (per 1000 examinations) was higher in patients at high risk according to the DL model (20.6 [95% CI: 11.8, 35.6]) than according to the TC (6.0 [95% CI: 2.9, 12.3]; P < .01) and BCRAT (6.8 [95% CI: 2.9, 15.8]; P = .04) lifetime models. PPV1, PPV2, and PPV3 were higher in patients identified as high risk by the DL model (PPV1, 14.6%; PPV2, 32.4%; PPV3, 36.4%) than those identified as high risk with the TC (PPV1, 5.0%; PPV2, 12.7%; PPV3, 13.5%; P value range, .02-.03) and BCRAT (PPV1, 5.5%; PPV2, 11.1%; PPV3, 12.5%; P value range, .02-.05) lifetime models.

Conclusion: Patients identified as high risk by a mammogram-based DL risk assessment model showed higher CDR at breast screening MRI than patients identified as high risk with traditional risk models.

© RSNA, 2023. Supplemental material is available for this article. See also the editorial by Bae in this issue.
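
The risk cutoffs and outcome metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the study's code: the function names and the "below_threshold" label are hypothetical, the PPV helper assumes the usual definition (cancers divided by the relevant denominator: abnormal screening findings for PPV1, biopsies recommended for PPV2, biopsies performed for PPV3), and the counts in the usage lines are invented for illustration and are not study data. Only the threshold values (1.67%, 20%, 2.3, 6.6) and the per-1000 scaling of CDR come from the abstract.

```python
# Thresholds quoted in the abstract.
TRADITIONAL_5YR_INCREASED = 1.67  # percent, traditional 5-year models
TRADITIONAL_LIFETIME_HIGH = 20.0  # percent, traditional lifetime models
DL_INCREASED = 2.3                # absolute DL score
DL_HIGH = 6.6                     # absolute DL score


def dl_risk_category(score: float) -> str:
    """Map an absolute DL score to a category (labels are hypothetical)."""
    if score >= DL_HIGH:
        return "high"
    if score >= DL_INCREASED:
        return "increased"
    return "below_threshold"


def is_increased_risk_5yr(risk_percent: float) -> bool:
    """Traditional 5-year model: increased risk at 1.67% or higher."""
    return risk_percent >= TRADITIONAL_5YR_INCREASED


def is_high_risk_lifetime(risk_percent: float) -> bool:
    """Traditional lifetime model: high risk at 20% or higher."""
    return risk_percent >= TRADITIONAL_LIFETIME_HIGH


def cancer_detection_rate(cancers: int, examinations: int) -> float:
    """Cancers detected per 1000 screening examinations."""
    if examinations <= 0:
        raise ValueError("examinations must be positive")
    return 1000.0 * cancers / examinations


def ppv(cancers: int, denominator: int) -> float:
    """Assumed PPV definition: cancers / denominator, where the denominator
    is abnormal findings (PPV1), biopsies recommended (PPV2), or biopsies
    performed (PPV3)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return cancers / denominator


# Invented counts, for illustration only (not study data).
example_cdr = cancer_detection_rate(cancers=2, examinations=400)
example_ppv3 = ppv(cancers=1, denominator=4)
print(dl_risk_category(7.1), example_cdr, example_ppv3)
```

Note that the statistical comparison reported in the abstract used logistic regression models; the sketch above only covers the categorization and the point estimates, not the confidence intervals or P values.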

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms* / diagnostic imaging
  • Deep Learning*
  • Early Detection of Cancer
  • Female
  • Humans
  • Magnetic Resonance Imaging
  • Middle Aged
  • Retrospective Studies