Incremental retraining, clinical implementation, and acceptance rate of deep learning auto-segmentation for male pelvis in a multiuser environment

Med Phys. 2023 Jul;50(7):4079-4091. doi: 10.1002/mp.16537. Epub 2023 Jun 7.

Abstract

Background: Deep learning auto-segmentation (DLAS) models have been adopted in the clinic; however, their performance deteriorates owing to variability in clinical practice. Some commercial DLAS software provides an incremental retraining function that enables users to train a custom model on their institutional data to account for this variability.

Purpose: This study was performed to evaluate and implement commercial DLAS software with an incremental retraining function for the definitive treatment of patients with prostate cancer in a multi-user environment.

Methods: CT-based delineations of target organs and organs at risk (OARs) from 215 prostate cancer patients were used. The built-in models of three commercial DLAS software packages were validated on 20 patients. A retrained custom model was developed using 100 patients and evaluated on the remaining data (n = 115). Dice similarity coefficient (DSC), Hausdorff distance (HD), mean surface distance (MSD), and surface DSC (SDSC) were used for quantitative evaluation. A blinded multi-rater qualitative evaluation was performed on a five-level scale. Consensus and non-consensus unacceptable cases were visually inspected to identify the failure modes.
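
The four quantitative metrics named above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not say how surfaces were extracted, which MSD variant was used, or what SDSC tolerance was applied. The function names, the pooled symmetric form of MSD, the point-count approximation of SDSC, and the toy contours are all assumptions made for illustration. Distances are in the units of the input coordinates, so voxel indices would need scaling by the voxel spacing first.

```python
import math

def dice(a, b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention for two empty masks (assumption)
    return 2.0 * len(a & b) / (len(a) + len(b))

def _directed_distances(src, dst):
    """Distance from each point in src to its nearest point in dst."""
    dst = list(dst)
    return [min(math.dist(p, q) for q in dst) for p in src]

def hausdorff(a_surf, b_surf):
    """Symmetric Hausdorff distance between two surface point sets."""
    return max(max(_directed_distances(a_surf, b_surf)),
               max(_directed_distances(b_surf, a_surf)))

def mean_surface_distance(a_surf, b_surf):
    """Mean of the nearest-surface distances pooled over both directions."""
    d = _directed_distances(a_surf, b_surf) + _directed_distances(b_surf, a_surf)
    return sum(d) / len(d)

def surface_dice(a_surf, b_surf, tol):
    """Fraction of surface points lying within tol of the other surface."""
    d = _directed_distances(a_surf, b_surf) + _directed_distances(b_surf, a_surf)
    return sum(x <= tol for x in d) / len(d)

# Toy 2D example: two 2x2 squares shifted by one pixel (hypothetical data).
# Every pixel of a 2x2 square is on its boundary, so each mask is also its surface.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice(a, b))                   # 0.5
print(hausdorff(a, b))              # 1.0
print(mean_surface_distance(a, b))  # 0.5
print(surface_dice(a, b, tol=0.5))  # 0.5
```

The brute-force nearest-point search is quadratic in the number of surface points, which is acceptable for a toy example; a clinical pipeline would more likely use distance transforms or a spatial index.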

Results: The built-in models from three commercial DLAS vendors achieved suboptimal performance on the 20 validation patients. The retrained custom model had a mean DSC of 0.82 for prostate, 0.48 for seminal vesicles (SV), and 0.92 for rectum. This represents a significant improvement over the built-in model, which had DSCs of 0.73, 0.37, and 0.81 for the corresponding structures. Compared to the acceptance rate of 96.5% and consensus unacceptable rate (i.e., both reviewers rated the contour as unacceptable) of 3.5% achieved by manual contours, the custom model achieved a 91.3% acceptance rate and an 8.7% consensus unacceptable rate. The failure modes of the retrained custom model were attributed to the following: cystogram (n = 2), hip prosthesis (n = 2), low-dose-rate brachytherapy seeds (n = 2), air in endorectal balloon (n = 1), non-iodinated spacer (n = 2), and giant bladder (n = 1).
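
As a rough illustration of how the acceptance and consensus unacceptable rates relate, the sketch below computes both from per-case reviewer flags. It assumes two reviewers per case and that the acceptance rate is the complement of the consensus unacceptable rate, which is an inference from the reported pairs each summing to 100%, not a definition stated in the abstract. The function name and the flag data are hypothetical. The synthetic input uses 10 consensus unacceptable cases out of 115, which is consistent with the reported 8.7% and with the listed failure-mode counts summing to 10.

```python
def review_rates(flags):
    """Summarize reviewer ratings.

    flags: one tuple per case, one bool per reviewer
           (True means that reviewer rated the contour unacceptable).
    """
    n = len(flags)
    consensus = sum(all(case) for case in flags)
    non_consensus = sum(any(case) and not all(case) for case in flags)
    return {
        "consensus_unacceptable": consensus / n,
        "non_consensus_unacceptable": non_consensus / n,
        # Assumption: a case counts as accepted unless all reviewers reject it.
        "acceptance": 1 - consensus / n,
    }

# Synthetic flags, not study data: 10 cases rejected by both reviewers,
# 105 cases rejected by neither.
flags = [(True, True)] * 10 + [(False, False)] * 105
rates = review_rates(flags)

print(round(rates["consensus_unacceptable"] * 100, 1))  # 8.7
print(round(rates["acceptance"] * 100, 1))              # 91.3
```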

Conclusion: The commercial DLAS software with the incremental retraining function was validated and clinically adopted for prostate patients in a multi-user environment. AI-based auto-delineation of the prostate and OARs is shown to achieve improved physician acceptance, overall clinical utility, and accuracy.

Keywords: deep learning auto-segmentation; inter-observer contour variation; prostate radiotherapy.

MeSH terms

  • Deep Learning*
  • Humans
  • Image Processing, Computer-Assisted
  • Male
  • Organs at Risk
  • Pelvis
  • Prostatic Neoplasms* / radiotherapy
  • Radiotherapy Planning, Computer-Assisted