Laryngeal surface reconstructions from monocular endoscopic videos: a structure from motion pipeline for periodic deformations

Justin Regef; Likhit Talasila; Julia Wiercigroch; R Jun Lin; Lueder A Kahrs

doi:10.1007/s11548-024-03118-x

Int J Comput Assist Radiol Surg. 2024 Apr 23. doi: 10.1007/s11548-024-03118-x. Online ahead of print.

Authors

Justin Regef¹ ², Likhit Talasila³ ⁴, Julia Wiercigroch³ ⁵, R Jun Lin⁶, Lueder A Kahrs³ ⁴ ⁵ ⁶ ⁷

Affiliations

¹ Medical Computer Vision and Robotics Lab, University of Toronto, Toronto, ON, Canada. justin.regef@mail.utoronto.ca.
² Department of Mathematical and Computational Sciences, University of Toronto Mississauga, 3359 Mississauga Rd, Mississauga, ON, L5L 1C6, Canada. justin.regef@mail.utoronto.ca.
³ Medical Computer Vision and Robotics Lab, University of Toronto, Toronto, ON, Canada.
⁴ Department of Mathematical and Computational Sciences, University of Toronto Mississauga, 3359 Mississauga Rd, Mississauga, ON, L5L 1C6, Canada.
⁵ Department of Computer Science, University of Toronto, 40 St George St, Toronto, ON, M5S 2E4, Canada.
⁶ Department of Otolaryngology - Head & Neck Surgery, Unity Health Toronto - St. Michael's Hospital, Temerty Faculty of Medicine, University of Toronto, 36 Queen St E, Toronto, ON, M5B 1W8, Canada.
⁷ Institute of Biomedical Engineering, University of Toronto, 164 College Street, Toronto, ON, M5S 3G9, Canada.

PMID: 38652415
DOI: 10.1007/s11548-024-03118-x

Abstract

Purpose: Surface reconstructions from laryngoscopic videos have the potential to assist clinicians in diagnosing, quantifying, and monitoring airway diseases using minimally invasive techniques. However, tissue movements and deformations make these reconstructions challenging for conventional pipelines.

Methods: To facilitate such reconstructions, we developed video frame pre-filtering and featureless dense matching steps to enhance the AliceVision Meshroom structure from motion (SfM) pipeline. Time and the anterior glottic angle were used to approximate the rigid state of the airway and to collect frames with different camera poses. Featureless dense matches were tracked with a correspondence transformer across subsets of images to extract matched points that could be used to estimate the point cloud and reconstructed surface. The proposed pipeline was tested on a simulated dataset under varying conditions, such as illumination and resolution, and on real laryngoscopic videos.
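The pre-filtering idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_frames`, its parameters, and the selection rule (keep frames whose anterior glottic angle lies near a target value and that are separated by a minimum time gap) are assumptions made for illustration only. Angle matching stands in for "approximately the same rigid state", and the time gap stands in for "different camera poses".

```python
from typing import List, Sequence


def select_frames(
    times: Sequence[float],
    angles: Sequence[float],
    target_angle: float,
    angle_tol: float,
    min_gap: float,
    max_frames: int,
) -> List[int]:
    """Return indices of frames that share a similar glottic angle.

    Hypothetical rule (an assumption, not taken from the paper):
    - keep a frame only if its anterior glottic angle is within
      `angle_tol` degrees of `target_angle`;
    - require at least `min_gap` seconds since the last kept frame,
      so that the camera has had time to move;
    - stop once `max_frames` frames have been collected.
    """
    if len(times) != len(angles):
        raise ValueError("times and angles must have the same length")

    selected: List[int] = []
    last_time = None
    for i, (t, a) in enumerate(zip(times, angles)):
        if abs(a - target_angle) > angle_tol:
            continue  # airway is in a different deformation state
        if last_time is not None and t - last_time < min_gap:
            continue  # too close in time to the previously kept frame
        selected.append(i)
        last_time = t
        if len(selected) == max_frames:
            break
    return selected
```

For example, given per-frame timestamps in seconds and angles in degrees, `select_frames(times, angles, target_angle=30.0, angle_tol=2.0, min_gap=0.25, max_frames=4)` would return up to four frame indices that could then be passed to the matching and reconstruction stages.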

Results: Our pipeline was able to reconstruct the laryngeal region based on 4, 8, and 16 images obtained from simulated and real patient exams. The pipeline was robust to sparse inputs, blur, and extreme lighting conditions, unlike the Meshroom pipeline, which failed to produce a point cloud for 6 of 15 simulated datasets.

Conclusion: The pre-filtering and featureless dense matching modules specialize the conventional SfM pipeline to handle challenging laryngoscopic examinations directly from patient videos. These 3D visualizations have the potential to improve spatial understanding of airway conditions.

Keywords: 3D reconstruction; Laryngoscope; Photogrammetry; Structure from motion.