Vision-Based Assistance for Vocal Fold Identification in Laryngoscopy with Knowledge Distillation

Stud Health Technol Inform. 2024 Jan 25;310:946-950. doi: 10.3233/SHTI231104.

Abstract

Laryngoscopy images play a vital role in bringing together computer vision and otorhinolaryngology research. However, few studies offer laryngeal datasets for comparative evaluation. This study therefore introduces a novel dataset of vocal fold images. We also propose a lightweight network trained with knowledge distillation: our student model achieves around 98.4% accuracy, comparable to the original EfficientNetB1, while reducing model weights by up to 88%. In addition, we present an AI-assisted smartphone solution that enables a portable, intelligent laryngoscopy system and helps laryngoscopists efficiently target vocal fold areas for observation and diagnosis. In summary, our contributions are a laryngeal image dataset and a compressed version of the EfficientNetB1 model suitable for handheld laryngoscopy devices.
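
The abstract does not state which distillation objective was used. As an illustration only, the sketch below assumes the common soft-target formulation, in which the student is trained on a weighted sum of the cross-entropy against the true label and the temperature-scaled KL divergence from the teacher's softened predictions. The function names, the temperature of 4.0 and the weight alpha of 0.5 are hypothetical choices for this example, not values from the study.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Soft-target distillation loss for a single example.

    Assumed form (not confirmed by the abstract):
        alpha * CE(label, student) + (1 - alpha) * T^2 * KL(teacher_T || student_T)
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-label term: KL divergence between softened teacher and student outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # T^2 keeps the soft-term gradient scale comparable across temperatures.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


# Hypothetical two-class logits (e.g. "vocal fold visible" vs "not visible").
teacher = [2.0, 0.5]
agreeing_student = [2.0, 0.5]
disagreeing_student = [0.5, 2.0]

loss_agree = distillation_loss(agreeing_student, teacher, label=0)
loss_disagree = distillation_loss(disagreeing_student, teacher, label=0)
```

A student whose logits match the teacher's incurs no soft-target penalty, so its loss reduces to the weighted hard-label term; a student that disagrees with the teacher is penalised by both terms.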

Keywords: Laryngoscopy; knowledge distillation; vision-based assistance; vocal folds.

MeSH terms

  • Humans
  • Intelligence
  • Knowledge
  • Laryngoscopy
  • Larynx*
  • Vocal Cords*