i3PosNet: instrument pose estimation from X-ray in temporal bone surgery

Int J Comput Assist Radiol Surg. 2020 Jul;15(7):1137-1145. doi: 10.1007/s11548-020-02157-4. Epub 2020 May 21.

Abstract

Purpose: Accurate estimation of the position and orientation (pose) of surgical instruments is crucial for delicate minimally invasive temporal bone surgery. Current techniques lack accuracy and/or suffer from line-of-sight constraints (conventional tracking systems) or expose the patient to prohibitive ionizing radiation (intra-operative CT). A possible solution is to capture the instrument with a C-arm at irregular intervals and recover the pose from the image.

Methods: i3PosNet infers the position and orientation of instruments from images using a pose estimation network. The framework operates on localized image patches and outputs pseudo-landmarks. The pose is then reconstructed from these pseudo-landmarks by geometric considerations.
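
The abstract does not spell out the geometric reconstruction, so the following is only a minimal sketch of how a pose could be recovered from pseudo-landmarks. It assumes 2D landmarks in image coordinates, with the first landmark at the instrument tip and the others spread along the instrument axis. The function name `pose_from_landmarks` and this landmark layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def pose_from_landmarks(landmarks):
    """Hypothetical in-plane pose from 2D pseudo-landmarks.

    Assumption (not from the paper): landmarks[0] is the instrument tip
    and the remaining landmarks lie along the instrument axis.
    Returns (position, angle_deg) in image coordinates.
    """
    pts = np.asarray(landmarks, dtype=float)
    if pts.ndim != 2 or pts.shape[0] < 2 or pts.shape[1] != 2:
        raise ValueError("expected at least two (x, y) landmarks")

    centroid = pts.mean(axis=0)
    # Total least-squares line fit: the first right singular vector is the
    # principal direction of the centered landmarks.
    _, _, vt = np.linalg.svd(pts - centroid)
    direction = vt[0]

    # Orient the axis so it points from the tip towards the other landmarks.
    if np.dot(pts[1:].mean(axis=0) - pts[0], direction) < 0:
        direction = -direction

    # Project the tip landmark onto the fitted line to average out noise.
    position = centroid + np.dot(pts[0] - centroid, direction) * direction
    angle_deg = np.degrees(np.arctan2(direction[1], direction[0]))
    return position, angle_deg


# Example with four noise-free landmarks on a straight line.
position, angle_deg = pose_from_landmarks([(10, 20), (13, 24), (16, 28), (19, 32)])
```

Fitting a line through all landmarks, rather than reading the pose off two points, is one way to make the estimate less sensitive to the error of any single predicted landmark. The sketch covers only the in-plane position and rotation.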

Results: We show that i3PosNet reaches errors [Formula: see text] mm. It outperforms conventional image registration-based approaches, reducing average and maximum errors by at least two thirds. i3PosNet trained on synthetic images generalizes to real X-rays without any further adaptation.

Conclusion: The translation of deep learning-based methods to surgical applications is difficult because large representative datasets for training and testing are not available. This work empirically demonstrates sub-millimeter pose estimation with a network trained solely on synthetic data.

Keywords: Cochlear implant; Fluoroscopic tracking; Instrument pose estimation; Minimally invasive bone surgery; Modular deep learning; Vestibular schwannoma removal.

MeSH terms

  • Humans
  • Imaging, Three-Dimensional / methods
  • Minimally Invasive Surgical Procedures
  • Otologic Surgical Procedures / methods*
  • Radiography
  • Surgery, Computer-Assisted / methods*
  • Temporal Bone / diagnostic imaging
  • Temporal Bone / surgery*