Hyoid Bone Tracking in a Videofluoroscopic Swallowing Study Using a Deep-Learning-Based Segmentation Network

Diagnostics (Basel). 2021 Jun 23;11(7):1147. doi: 10.3390/diagnostics11071147.

Abstract

Kinematic analysis of the hyoid bone in a videofluoroscopic swallowing study (VFSS) is important for assessing dysphagia. However, calibrating hyoid bone movement is time-consuming, and its reliability varies widely. Computer-assisted analysis has been studied as a way to improve the efficiency and accuracy of hyoid bone identification and tracking, but its performance has been limited. In this study, we aimed to design a robust network that can track hyoid bone movement automatically without human intervention. Using 69,389 frames from 197 VFSS files as the data set, a deep learning model for detection and trajectory prediction was constructed and trained using the BiFPN-U-Net(T) network. The present model showed improved performance compared with previous models: an area under the curve (AUC) of 0.998 for pixelwise accuracy, an object detection accuracy of 99.5%, and a Dice similarity of 90.9%. The bounding box detection performance for the hyoid bone and reference objects was superior to that of other models, with a mean average precision of 95.9%. The estimation of the distance of hyoid bone movement also showed higher accuracy. The deep learning model proposed in this study could be used to detect and track the hyoid bone more efficiently and accurately in VFSS analysis.
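
For readers unfamiliar with the Dice similarity reported above, the following is a minimal sketch of how the metric is commonly computed for a pair of binary segmentation masks. It is not the authors' evaluation code: the function name `dice_similarity`, the list-of-lists mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_similarity(pred: Sequence[Sequence[int]],
                    truth: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape.

    Hypothetical helper for illustration; any nonzero pixel counts as foreground.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels that are foreground in both masks
    total = 0         # foreground pixels in pred plus foreground pixels in truth
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            intersection += int(p_on and t_on)
            total += int(p_on) + int(t_on)

    if total == 0:
        # Assumed convention: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks (made-up values, not data from the study).
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
reference_mask = [[0, 1, 0],
                  [0, 1, 1]]
toy_score = dice_similarity(predicted_mask, reference_mask)
```

In this toy case each mask has three foreground pixels and two of them overlap, so the score is 2 × 2 / (3 + 3) ≈ 0.667. A value of 90.9% would correspond to a much closer overlap between the predicted and reference hyoid regions.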

Keywords: deep learning; dysphagia; hyoid bone; videofluoroscopy.