A deep learning-based 3D Prompt-nnUnet model for automatic segmentation in brachytherapy of postoperative endometrial carcinoma

J Appl Clin Med Phys. 2024 Apr 29:e14371. doi: 10.1002/acm2.14371. Online ahead of print.

Abstract

Purpose: To develop and evaluate a three-dimensional (3D) Prompt-nnUnet model that combines a prompt-based model with 3D nnUnet for rapid and consistent autosegmentation of the high-risk clinical target volume (HR CTV) and organs at risk (OARs) in high-dose-rate brachytherapy (HDR BT) for patients with postoperative endometrial carcinoma (EC).

Methods and materials: In two experimental batches, 321 computed tomography (CT) scans from 321 patients with EC were used for HR CTV segmentation, and 125 CT scans from 125 patients were used for OAR segmentation. The training/validation/test splits were 257/32/32 for HR CTV and 87/13/25 for OARs. The deep learning neural networks 3D Prompt-nnUnet and 3D nnUnet were compared for HR CTV and OAR segmentation. Three-fold cross-validation was employed, and performance was assessed with several quantitative metrics: the Dice similarity coefficient (DSC), Hausdorff distance (HD), 95th percentile of Hausdorff distance (HD95%), and intersection over union (IoU).
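
The four metrics named above can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code. The function names (`dice_and_iou`, `surface_points`, `hausdorff_distances`), the voxel-surface definition, and the toy cubes are assumptions made for illustration. HD95% is taken here as the larger of the two directed 95th-percentile surface distances, which is one of several conventions in use. The brute-force distance matrix is only practical for small masks.

```python
import numpy as np


def dice_and_iou(pred, gt):
    """Return (DSC, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    if total == 0:  # both masks empty: treat as perfect agreement
        return 1.0, 1.0
    return 2.0 * inter / total, inter / union


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that have at least one
    background face-neighbour (a simple surface definition)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        interior &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    surface = padded & ~interior
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    return np.argwhere(surface[core]) * np.asarray(spacing, dtype=float)


def hausdorff_distances(pred, gt, spacing=(1.0, 1.0, 1.0)):
    """Return (HD, HD95) between the two mask surfaces, in the units of spacing."""
    a = surface_points(pred, spacing)
    b = surface_points(gt, spacing)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both masks must contain foreground voxels")
    # Pairwise Euclidean distances between all surface points (small masks only).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)  # each pred surface point to nearest gt surface point
    b_to_a = d.min(axis=0)  # each gt surface point to nearest pred surface point
    hd = max(a_to_b.max(), b_to_a.max())
    hd95 = max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))
    return float(hd), float(hd95)


# Toy example: a 4x4x4 cube and the same cube shifted by one voxel.
gt_mask = np.zeros((8, 8, 8), dtype=bool)
gt_mask[2:6, 2:6, 2:6] = True
pred_mask = np.zeros((8, 8, 8), dtype=bool)
pred_mask[3:7, 2:6, 2:6] = True

dsc, iou = dice_and_iou(pred_mask, gt_mask)
hd, hd95 = hausdorff_distances(pred_mask, gt_mask)
print(f"DSC={dsc:.2f} IoU={iou:.2f} HD={hd:.2f} HD95={hd95:.2f}")
```

Passing the CT voxel spacing in millimetres as `spacing` would give HD and HD95% in mm, the unit used for the reported results.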

Results: The Prompt-nnUnet included two forms, Predict-Prompt (PP) and Label-Prompt (LP); the LP performed most similarly to the experienced radiation oncologist and outperformed the less experienced ones. During the testing phase, the mean DSC values for the LP were 0.96 ± 0.02, 0.91 ± 0.02, and 0.83 ± 0.07 for HR CTV, rectum, and urethra, respectively. The mean HD values (mm) were 2.73 ± 0.95, 8.18 ± 4.84, and 2.11 ± 0.50, respectively. The mean HD95% values (mm) were 1.66 ± 1.11, 3.07 ± 0.94, and 1.35 ± 0.55, respectively. The mean IoUs were 0.92 ± 0.04, 0.84 ± 0.03, and 0.71 ± 0.09, respectively. The delineation time with the new model was < 2.35 s per structure, which could save clinician time.
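
As a reading aid, DSC and IoU for a single case are linked by the identity IoU = DSC / (2 − DSC). The sketch below applies it to the mean DSC values reported above. The helper name `iou_from_dice` is hypothetical. Because the identity holds per case and is nonlinear, it only approximates the relationship between the reported means.

```python
def iou_from_dice(dsc):
    """Convert a single-case Dice coefficient to the equivalent IoU."""
    if not 0.0 <= dsc <= 1.0:
        raise ValueError("DSC must lie in [0, 1]")
    return dsc / (2.0 - dsc)


# Mean DSC values reported for the LP (HR CTV, rectum, urethra).
reported_mean_dsc = {"HR CTV": 0.96, "rectum": 0.91, "urethra": 0.83}
for name, dsc in reported_mean_dsc.items():
    print(f"{name}: DSC={dsc:.2f} -> approx. IoU={iou_from_dice(dsc):.3f}")
```

The converted values fall within about 0.01 of the reported mean IoUs, so the two sets of figures are mutually consistent.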

Conclusion: The Prompt-nnUnet architecture, particularly the LP, was highly consistent with the ground truth (GT) in HR CTV and OAR autosegmentation, reducing interobserver variability and shortening treatment time.

Keywords: EC; HDR BT; Prompt‐nnUnet deep learning (DL) model; autosegmentation of HR CTV or OAR.