Real-to-virtual domain transfer-based depth estimation for real-time 3D annotation in transnasal surgery: a study of annotation accuracy and stability

Int J Comput Assist Radiol Surg. 2021 May;16(5):731-739. doi: 10.1007/s11548-021-02346-9. Epub 2021 Mar 30.

Abstract

Purpose: Surgical annotation promotes effective communication between medical personnel during surgical procedures. However, most existing 2D annotation approaches remain static with respect to the display. In this work, we propose a method to achieve 3D annotations that remain rigidly and stably anchored to target structures as the camera moves in a transnasal endoscopic surgery setting.

Methods: This is accomplished through intra-operative endoscope tracking and monocular depth estimation. A virtual endoscopic environment is used to train a supervised depth estimation network. An adversarial network transfers the style of the real endoscopic view to a synthetic-like view, which is then fed to the depth estimation network so that framewise depth can be obtained in real time.
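
The per-frame inference pipeline can be sketched as follows. This is a minimal, hedged illustration: the network classes, layer sizes, and function names below are stand-in assumptions and do not reproduce the authors' architectures; only the two-stage order (real-to-virtual style transfer, then depth prediction) follows the description above.

    # Minimal sketch of the per-frame inference pipeline (illustrative only):
    # a style-transfer generator maps a real endoscopic frame to a
    # synthetic-like frame, which a monocular depth network then converts to
    # a depth map. Both networks are tiny stand-ins, not the paper's models.
    import torch
    import torch.nn as nn

    class StyleTransferGenerator(nn.Module):
        """Stand-in for the adversarially trained real-to-virtual generator."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
            )

        def forward(self, x):
            return self.net(x)

    class DepthEstimator(nn.Module):
        """Stand-in for the depth network trained on virtual endoscopy data."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 1, 3, padding=1),
            )

        def forward(self, x):
            return self.net(x)

    @torch.no_grad()
    def estimate_depth(frame: torch.Tensor,
                       generator: nn.Module,
                       depth_net: nn.Module) -> torch.Tensor:
        """Map a real frame (1x3xHxW, values in [-1, 1]) to a depth map."""
        synthetic_like = generator(frame)  # real -> synthetic-like appearance
        return depth_net(synthetic_like)   # framewise depth prediction

    if __name__ == "__main__":
        gen, depth = StyleTransferGenerator().eval(), DepthEstimator().eval()
        frame = torch.rand(1, 3, 256, 256) * 2 - 1  # dummy endoscopic frame
        print(estimate_depth(frame, gen, depth).shape)  # [1, 1, 256, 256]

In the full system, the predicted depth at an annotated pixel, combined with the tracked endoscope pose, is what allows a 2D annotation to be lifted to a 3D point anchored to the anatomy.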

Results: (1) Accuracy: framewise depth was predicted from images captured inside a nasal airway phantom and compared with ground truth, achieving a structural similarity index (SSIM) of 0.8310 ± 0.0655. (2) Stability: the mean absolute error (MAE) between the reference and predicted depth of a target point was 1.1330 ± 0.9957 mm.
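
The two reported metrics could be computed along the following lines. This is a hedged sketch, not the authors' evaluation code: the variable names, the data-range choice for SSIM, and the dummy data are assumptions for illustration.

    # Sketch of the two metrics: SSIM between predicted and ground-truth
    # depth maps (accuracy), and MAE of a target point's depth over frames
    # (stability). Names and data_range choice are illustrative assumptions.
    import numpy as np
    from skimage.metrics import structural_similarity

    def depth_ssim(pred_depth: np.ndarray, gt_depth: np.ndarray) -> float:
        """SSIM between predicted and ground-truth depth maps (same shape)."""
        data_range = gt_depth.max() - gt_depth.min()
        return structural_similarity(pred_depth, gt_depth,
                                     data_range=data_range)

    def target_point_mae(pred_mm: np.ndarray, ref_mm: np.ndarray) -> float:
        """MAE (mm) between predicted and reference depth of a target point,
        collected over a sequence of frames."""
        return float(np.mean(np.abs(pred_mm - ref_mm)))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        gt = rng.uniform(5.0, 60.0, size=(256, 256))     # dummy depth (mm)
        pred = gt + rng.normal(0.0, 1.0, size=gt.shape)  # dummy prediction
        print(f"SSIM: {depth_ssim(pred, gt):.4f}")
        print(f"MAE:  {target_point_mae(pred[128, ::8], gt[128, ::8]):.4f} mm")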

Conclusion: Both the accuracy and stability evaluations demonstrated the feasibility and practicality of our proposed method for achieving 3D annotations.

Keywords: Augmented reality; Domain transfer learning; Monocular depth estimation; Surgical annotation; Transnasal surgery.

MeSH terms

  • Cadaver
  • Calibration
  • Endoscopy / methods*
  • Humans
  • Image Processing, Computer-Assisted
  • Imaging, Three-Dimensional / methods*
  • Monitoring, Intraoperative
  • Phantoms, Imaging*
  • Reproducibility of Results
  • Tomography, X-Ray Computed
  • Video Recording