Multi-class motion-based semantic segmentation for ureteroscopy and laser lithotripsy

Soumya Gupta; Sharib Ali; Louise Goldsmith; Ben Turney; Jens Rittscher

doi:10.1016/j.compmedimag.2022.102112

Comput Med Imaging Graph. 2022 Oct:101:102112. doi: 10.1016/j.compmedimag.2022.102112. Epub 2022 Aug 8.

Authors

Soumya Gupta¹, Sharib Ali², Louise Goldsmith³, Ben Turney³, Jens Rittscher⁴

Affiliations

¹ Institute of Biomedical Engineering (IBME), Department of Engineering Science, University of Oxford, Oxford, UK; Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Oxford, UK. Electronic address: soumya.gupta@eng.ox.ac.uk.
² Institute of Biomedical Engineering (IBME), Department of Engineering Science, University of Oxford, Oxford, UK; Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Oxford, UK; Oxford NIHR Biomedical Research Centre, University of Oxford, Oxford, UK; School of Computing, University of Leeds, Leeds, UK.
³ Department of Urology, The Churchill, Oxford University Hospitals NHS Trust, Oxford, UK.
⁴ Institute of Biomedical Engineering (IBME), Department of Engineering Science, University of Oxford, Oxford, UK; Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Oxford, UK; Oxford NIHR Biomedical Research Centre, University of Oxford, Oxford, UK; Ludwig Institute for Cancer Research, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK. Electronic address: jens.rittscher@eng.ox.ac.uk.

PMID: 36030620
DOI: 10.1016/j.compmedimag.2022.102112

Abstract

Ureteroscopy with laser lithotripsy has become the most commonly used technique for the treatment of kidney stones. Automated segmentation of kidney stones and the laser fiber is an essential first step toward any automated quantitative analysis, particularly stone-size estimation, which the surgeon can use to decide whether the stone requires further fragmentation. However, factors such as turbid fluid inside the cavity, specularities, motion blur due to kidney movements and camera motion, bleeding, and stone debris impact the quality of vision within the kidney, leading to extended operative times. To the best of our knowledge, this is the first attempt at multi-class segmentation in ureteroscopy and laser lithotripsy data. We propose an end-to-end convolutional neural network (CNN) based learning framework for the segmentation of stones and laser fiber. The proposed approach utilizes two sub-networks: (I) HybResUNet, a hybrid version of residual U-Net that uses residual connections in the encoder path of the U-Net to improve semantic predictions, and (II) DVFNet, which generates deformation vector field (DVF) predictions by leveraging motion differences between adjacent video frames; these predictions are then used to prune the prediction maps. We also present ablation studies that combine different dilated convolutions, recurrent and residual connections, atrous spatial pyramid pooling, and attention gate models. Further, we propose a compound loss function that significantly boosts the segmentation performance on our data. We also provide an ablation study to determine the optimal data augmentation strategy for our dataset. Our qualitative and quantitative results illustrate that our proposed method outperforms state-of-the-art methods such as UNet and DeepLabv3+, showing a Dice similarity coefficient (DSC) improvement of 4.15% and 13.34%, respectively, on our in vivo test dataset. We further show that our proposed model outperforms state-of-the-art methods on an unseen out-of-sample clinical dataset, with a DSC improvement of 9.61%, 11%, and 5.24% over UNet, HybResUNet, and DeepLabv3+, respectively, in the case of the stone class, and an improvement of 31.79%, 22.15%, and 10.42% over UNet, HybResUNet, and DeepLabv3+, respectively, in the case of the laser class.
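
The results above are reported per class as DSC, the Dice similarity coefficient, which for a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition only, not the authors' evaluation code. The label encoding (0 = background, 1 = stone, 2 = laser fiber), the toy masks, and the choice to score a class absent from both masks as 1.0 are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice similarity coefficient for one class label over flattened masks."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    pred_hits = [p == label for p in pred]
    target_hits = [t == label for t in target]
    # Pixels assigned to `label` in both the prediction and the reference.
    intersection = sum(p and t for p, t in zip(pred_hits, target_hits))
    total = sum(pred_hits) + sum(target_hits)
    # Assumed convention: a class absent from both masks counts as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical label encoding: 0 = background, 1 = stone, 2 = laser fiber.
pred = [0, 1, 1, 1, 2, 2, 0, 0]
target = [0, 1, 1, 0, 2, 2, 2, 0]

stone_dsc = dice_coefficient(pred, target, label=1)
laser_dsc = dice_coefficient(pred, target, label=2)
print(f"stone DSC = {stone_dsc:.2f}, laser DSC = {laser_dsc:.2f}")
```

Computing the score separately for each label mirrors how the abstract reports stone and laser results separately; a single pooled score would hide a weak class behind a strong one.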

Keywords: DVFNet; Deep learning; Kidney stone; Laser lithotripsy; Semantic segmentation; U-net; Ureteroscopy.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Humans
Kidney Calculi* / diagnostic imaging
Kidney Calculi* / surgery
Lithotripsy, Laser* / methods
Neural Networks, Computer
Semantics
Ureteroscopy / methods

Grants and funding

203141/Z/16/Z/WT_/Wellcome Trust/United Kingdom