Performance of a Region of Interest-based Algorithm in Diagnosing International Society of Urological Pathology Grade Group ≥2 Prostate Cancer on the MRI-FIRST Database-CAD-FIRST Study

Eur Urol Oncol. 2024 Mar 15:S2588-9311(24)00056-7. doi: 10.1016/j.euo.2024.03.003. Online ahead of print.

Abstract

Background and objective: Prostate multiparametric magnetic resonance imaging (MRI) shows high sensitivity for International Society of Urological Pathology grade group (GG) ≥2 cancers. Many artificial intelligence algorithms have shown promising results in diagnosing clinically significant prostate cancer on MRI. Our objective was to assess a region-of-interest-based machine-learning algorithm aimed at characterising GG ≥2 prostate cancer on multiparametric MRI.

Methods: The lesions targeted at biopsy in the MRI-FIRST dataset were retrospectively delineated and assessed using a previously developed algorithm. The Prostate Imaging-Reporting and Data System version 2 (PI-RADSv2) score assigned prospectively before biopsy and the algorithm score calculated retrospectively in the regions of interest were compared for diagnosing GG ≥2 cancer, using the areas under the curve (AUCs) and the sensitivities and specificities calculated with predefined thresholds (PI-RADSv2 scores ≥3 and ≥4; algorithm scores yielding 90% sensitivity in the training database). Ten predefined biopsy strategies were assessed retrospectively.
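
The comparison metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the study's code: the function names and the eight toy scores and labels are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation, which may differ from the estimator the authors used. The AUC is returned as a fraction, whereas the abstract reports it as a percentage.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when a score >= threshold counts as positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case,
    with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: PI-RADS-like scores and GG >= 2 status (1 = yes).
toy_scores = [2, 3, 3, 4, 4, 5, 5, 5]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]

for threshold in (3, 4):
    sens, spec = sens_spec(toy_scores, toy_labels, threshold)
    print(f"threshold >= {threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"AUC = {auc(toy_scores, toy_labels):.4f}")
```

On the toy data, raising the threshold from ≥3 to ≥4 trades sensitivity for specificity, the same direction of trade-off the two PI-RADSv2 thresholds show in the study.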

Key findings and limitations: After excluding 19 patients, we analysed 232 patients imaged on 16 different scanners; 85 had GG ≥2 cancer at biopsy. At patient level, AUCs of the algorithm and PI-RADSv2 were 77% (95% confidence interval [CI]: 70-82) and 80% (CI: 74-85; p = 0.36), respectively. The algorithm's sensitivity and specificity were 86% (CI: 76-93) and 65% (CI: 54-73), respectively. PI-RADSv2 sensitivities and specificities were 95% (CI: 89-100) and 38% (CI: 26-47), and 89% (CI: 79-96) and 47% (CI: 35-57) for thresholds of ≥3 and ≥4, respectively. Using the PI-RADSv2 score to trigger a biopsy would have avoided 26-34% of biopsies while missing 5-11% of GG ≥2 cancers. Combining prostate-specific antigen density, the PI-RADSv2 and algorithm's scores would have avoided 44-47% of biopsies while missing 6-9% of GG ≥2 cancers. Limitations include the retrospective nature of the study and a lack of PI-RADS version 2.1 assessment.
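
As a rough consistency check, the share of biopsies avoided by a score threshold can be approximated from the patient-level figures reported above (232 patients, 85 with GG ≥2 cancer, and the PI-RADSv2 sensitivities and specificities). The sketch below assumes that every patient scoring below the threshold would skip biopsy; the function name is hypothetical, and because the inputs are rounded percentages the outputs are approximations, not the study's own calculation.

```python
def fraction_biopsies_avoided(n_total, n_cancer, sensitivity, specificity):
    """Fraction of patients scoring below the threshold (true negatives plus
    false negatives), i.e. those who would not be biopsied."""
    n_benign = n_total - n_cancer
    true_negatives = specificity * n_benign
    false_negatives = (1.0 - sensitivity) * n_cancer
    return (true_negatives + false_negatives) / n_total


# Figures taken from the abstract (rounded, so results are approximate).
N_TOTAL, N_CANCER = 232, 85

avoided_pirads3 = fraction_biopsies_avoided(N_TOTAL, N_CANCER, 0.95, 0.38)
avoided_pirads4 = fraction_biopsies_avoided(N_TOTAL, N_CANCER, 0.89, 0.47)

print(f"PI-RADSv2 >= 3: about {avoided_pirads3:.0%} of biopsies avoided")
print(f"PI-RADSv2 >= 4: about {avoided_pirads4:.0%} of biopsies avoided")
```

The two approximations fall at roughly 26% and 34%, in line with the reported 26-34% range of biopsies avoided, and the corresponding missed cancers (1 minus sensitivity) are 5% and 11%.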

Conclusions and clinical implications: The algorithm provided robust results in the multicentre multiscanner MRI-FIRST database and could help select patients for biopsy.

Patient summary: An artificial intelligence-based algorithm aimed at diagnosing aggressive cancers on prostate magnetic resonance imaging showed results similar to expert human assessment in a prospectively acquired multicentre test database.

Keywords: Artificial intelligence; Magnetic resonance imaging; Prostate biopsy; Prostate cancer; Radiomics.