From quantitative metrics to clinical success: assessing the utility of deep learning for tumor segmentation in breast surgery

Int J Comput Assist Radiol Surg. 2024 Apr 20. doi: 10.1007/s11548-024-03133-y. Online ahead of print.

Abstract

Purpose: Preventing positive margins is essential for ensuring favorable patient outcomes following breast-conserving surgery (BCS). Deep learning has the potential to enable this by automatically contouring the tumor and guiding resection in real time. However, evaluation of such models with respect to pathology outcomes is necessary for their successful translation into clinical practice.

Methods: Sixteen deep learning models based on established architectures from the literature are trained on 7318 ultrasound images from 33 patients. An expert ranks the models based on the contours they generate for images in our test set. The contours generated by each model are also analyzed using recorded cautery trajectories from five navigated BCS cases to predict margin status. Predicted margins are compared with pathology reports.
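
As an illustration only, margin prediction from a cautery trajectory could be sketched as a minimum-distance check between tracked cautery tip positions and points on the predicted tumor contour. The abstract does not specify the rule actually used, so the function name `predict_margin_positive`, the 3D point representation in millimeters, and the 1.0 mm threshold are all assumptions made for this sketch.

```python
import numpy as np

def predict_margin_positive(trajectory_mm, contour_mm, threshold_mm=1.0):
    """Hypothetical margin rule: predict a positive margin when the cautery
    tip comes within `threshold_mm` of any predicted tumor contour point.

    trajectory_mm: (N, 3) array of cautery tip positions (assumed units: mm)
    contour_mm:    (M, 3) array of predicted tumor contour points (mm)
    Returns (is_positive, min_distance_mm).
    """
    traj = np.asarray(trajectory_mm, dtype=float)
    cont = np.asarray(contour_mm, dtype=float)
    # Pairwise Euclidean distances via broadcasting: result has shape (N, M).
    diffs = traj[:, None, :] - cont[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    min_d = float(dists.min())
    return min_d <= threshold_mm, min_d

# Toy data (not from the study): the second tip position passes 0.5 mm
# from a contour point, so the margin is predicted positive.
trajectory = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
contour = [[0.0, 3.0, 4.0], [10.0, 0.0, 0.5]]
is_positive, min_distance = predict_margin_positive(trajectory, contour)
```

A brute-force distance matrix is adequate for short trajectories; a spatial index would be a natural substitute for long recordings.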

Results: The model ranked best by both quantitative evaluation and our visual ranking framework achieved a mean Dice score of 0.959. Quantitative metrics are positively associated with expert visual rankings. However, the predictive value of the generated contours was limited, with a sensitivity of 0.750 and a specificity of 0.433 when tested against pathology reports.
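
For reference, the metrics reported above follow standard definitions: the Dice score is 2|A∩B| / (|A| + |B|) for a predicted mask A and a reference mask B, sensitivity is TP / (TP + FN), and specificity is TN / (TN + FP). The sketch below is a minimal implementation of those definitions, assuming binary masks and binary margin labels; the function names and toy inputs are illustrative and are not taken from the study.

```python
import numpy as np

def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, true).sum() / total)

def sensitivity_specificity(predicted_positive, actual_positive):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    pred = np.asarray(predicted_positive, dtype=bool)
    actual = np.asarray(actual_positive, dtype=bool)
    tp = np.sum(pred & actual)
    fn = np.sum(~pred & actual)
    tn = np.sum(~pred & ~actual)
    fp = np.sum(pred & ~actual)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return float(sens), float(spec)

# Toy inputs (not study data).
toy_dice = dice_score([[1, 1], [0, 0]], [[1, 0], [0, 0]])
toy_sens, toy_spec = sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 0, 1])
```

The two metrics answer different questions: Dice measures pixel-level overlap with a reference contour, while sensitivity and specificity measure agreement of predicted margin status with pathology, so a high value on one does not imply a high value on the other.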

Conclusion: We present a clinical evaluation of deep learning models trained for intraoperative tumor segmentation in breast-conserving surgery. We demonstrate that automatic contouring is limited in predicting pathology margins despite achieving high performance on quantitative metrics.

Keywords: Breast ultrasound; Clinical evaluation; Deep learning; Surgical navigation.