FetNet: a recurrent convolutional network for occlusion identification in fetoscopic videos

Sophia Bano; Francisco Vasconcelos; Emmanuel Vander Poorten; Tom Vercauteren; Sebastien Ourselin; Jan Deprest; Danail Stoyanov

doi:10.1007/s11548-020-02169-0

Int J Comput Assist Radiol Surg. 2020 May;15(5):791-801. doi: 10.1007/s11548-020-02169-0. Epub 2020 Apr 29.

Authors

Sophia Bano¹, Francisco Vasconcelos², Emmanuel Vander Poorten³, Tom Vercauteren⁴, Sebastien Ourselin⁴, Jan Deprest⁵, Danail Stoyanov²

Affiliations

¹ Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) and Department of Computer Science, University College London, London, UK. sophia.bano@ucl.ac.uk.
² Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) and Department of Computer Science, University College London, London, UK.
³ Department of Mechanical Engineering, KU Leuven University, Leuven, Belgium.
⁴ School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK.
⁵ Department of Development and Regeneration, University Hospital Leuven, Leuven, Belgium.

Abstract

Purpose: Fetoscopic laser photocoagulation is a minimally invasive surgery for the treatment of twin-to-twin transfusion syndrome (TTTS). By using a lens/fibre-optic scope inserted into the amniotic cavity, the abnormal placental vascular anastomoses are identified and ablated to regulate blood flow to both fetuses. A limited field of view, occlusions due to the presence of the fetus, and low visibility make it difficult to identify all vascular anastomoses. Automatic computer-assisted techniques may provide a better understanding of the anatomical structure during surgery for risk-free laser photocoagulation and may help improve mosaics from fetoscopic videos.

Methods: We propose FetNet, a combined convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for the spatio-temporal identification of fetoscopic events. We adapt an existing CNN architecture for spatial feature extraction and integrate it with the LSTM network for end-to-end spatio-temporal inference. We introduce differential learning rates during model training to effectively utilise the pre-trained CNN weights. This may support computer-assisted interventions (CAI) during fetoscopic laser photocoagulation.
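
The idea of differential learning rates mentioned in Methods can be illustrated with a minimal, framework-free sketch. It is not the authors' training code: the group names, the parameter names (`conv_w`, `lstm_w`), the learning-rate values, and the choice of a smaller rate for the pre-trained CNN than for the newly added layers are all assumptions made for illustration (the last is a common arrangement, not one stated in the abstract).

```python
# Hypothetical illustration of differential learning rates:
# each parameter group is updated by plain SGD with its own rate.

def sgd_step(param_groups, grads):
    """Apply one SGD update, using each group's own learning rate."""
    for group in param_groups:
        lr = group["lr"]
        for name in group["params"]:
            group["params"][name] -= lr * grads[name]

# Assumed values: a small rate preserves pre-trained CNN weights,
# a larger rate lets the newly initialised layers adapt faster.
param_groups = [
    {"name": "pretrained_cnn", "lr": 1e-4, "params": {"conv_w": 1.0}},
    {"name": "new_lstm_and_classifier", "lr": 1e-2, "params": {"lstm_w": 1.0}},
]

# Identical gradients, so any difference in the update is due to the rates.
grads = {"conv_w": 1.0, "lstm_w": 1.0}
sgd_step(param_groups, grads)

conv_w = param_groups[0]["params"]["conv_w"]
lstm_w = param_groups[1]["params"]["lstm_w"]
```

With equal gradients, the pre-trained weight moves 100 times less than the new weight, which is the effect differential learning rates are meant to achieve.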

Results: We perform quantitative evaluation of our method using 7 in vivo fetoscopic videos captured from different human TTTS cases. The total duration of these videos was 5551 s (138,780 frames). To test the robustness of the proposed approach, we perform 7-fold cross-validation where each video is treated as a hold-out or test set and training is performed using the remaining videos.
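
The 7-fold protocol described in Results, with one video held out per fold, can be sketched as follows. The video identifiers are hypothetical placeholders; only the number of videos, the total duration, and the total frame count are taken from the abstract.

```python
def leave_one_video_out(video_ids):
    """Yield (train_ids, test_id) pairs, holding out one video per fold."""
    for test_id in video_ids:
        train_ids = [v for v in video_ids if v != test_id]
        yield train_ids, test_id

# Hypothetical identifiers for the 7 in vivo videos.
video_ids = [f"video_{i}" for i in range(1, 8)]
folds = list(leave_one_video_out(video_ids))

# Figures reported in the abstract: 5551 s and 138,780 frames in total.
total_seconds = 5551
total_frames = 138_780
mean_fps = total_frames / total_seconds  # average capture rate of the dataset
```

Splitting by video rather than by frame keeps all frames of a given case on one side of the split, so no frames from a test video leak into training. The reported totals also imply an average capture rate of about 25 frames per second.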

Conclusion: FetNet achieved superior performance compared to the existing CNN-based methods and provided improved inference because of the spatio-temporal information modelling. Online testing of FetNet, using a Tesla V100-DGXS-32GB GPU, achieved a frame rate of 114 fps. These results show that our method could potentially provide a real-time solution for CAI and for automating occlusion and photocoagulation identification during fetoscopic procedures.

Keywords: Computer assisted interventions (CAI); Deep learning; Fetoscopy; Surgical vision; Twin-to-twin transfusion syndrome (TTTS); Video segmentation.

MeSH terms

Female
Fetofetal Transfusion / surgery*
Fetoscopy / methods*
Humans
Laser Coagulation / methods*
Neural Networks, Computer*
Pregnancy
