Support of deep learning to classify vocal fold images in flexible laryngoscopy

Bich Anh Tran; Thao Thi Phuong Dao; Ho Dang Quy Dung; Ngoc Boi Van; Chanh Cong Ha; Nam Hoang Pham; Tu Cong Huyen Ton Nu Cam Nguyen; Tan-Cong Nguyen; Minh-Khoi Pham; Mai-Khiem Tran; Truong Minh Tran; Minh-Triet Tran

doi:10.1016/j.amjoto.2023.103800

Am J Otolaryngol. 2023 May-Jun;44(3):103800. doi: 10.1016/j.amjoto.2023.103800. Epub 2023 Feb 24.

Affiliations

¹ Otorhinolaryngology Department, Cho Ray Hospital, Ho Chi Minh City, Viet Nam. Electronic address: trananhbich2015@gmail.com.
² University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; Department of Otolaryngology, Thong Nhat Hospital, Ho Chi Minh City, Viet Nam. Electronic address: thao.dao2020@ict.jvn.edu.vn.
³ Department of Endoscopy, Cho Ray Hospital, Ho Chi Minh City, Viet Nam. Electronic address: quydung@gmail.com.
⁴ Department of Otolaryngology, Vinmec Central Park International Hospital, Ho Chi Minh City, Viet Nam. Electronic address: vanboingoc@gmail.com.
⁵ Department of Otolaryngology, 7A Military Hospital, Ho Chi Minh City, Viet Nam. Electronic address: chanhhacong@gmail.com.
⁶ Otorhinolaryngology Department, Cho Ray Hospital, Ho Chi Minh City, Viet Nam. Electronic address: bsnamtmh@gmail.com.
⁷ Department of Head and Neck Surgery, ENT Hospital, Ho Chi Minh City, Viet Nam. Electronic address: camtu_nguyen@yahoo.com.
⁸ University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; University of Social Sciences and Humanities, VNUHCM, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Viet Nam. Electronic address: ntcong@hcmussh.edu.vn.
⁹ University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam. Electronic address: pmkhoi@selab.hcmus.edu.vn.
¹⁰ University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam. Electronic address: tmkhiem@selab.hcmus.edu.vn.
¹¹ Otorhinolaryngology Department, Cho Ray Hospital, Ho Chi Minh City, Viet Nam. Electronic address: tranminhtruongcr@gmail.com.
¹² University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam. Electronic address: tmtriet@fit.hcmus.edu.vn.

PMID: 36905912

Abstract

Purpose: To collect an adequately sized dataset of flexible laryngoscopy images and to use deep learning models to objectively identify vocal folds and their lesions in those images.

Methods: We trained several novel deep learning models to classify 4549 flexible laryngoscopy images into three classes: no vocal fold, normal vocal folds, and abnormal vocal folds. Training on these classes is intended to help the models recognize vocal folds and their lesions within the images. We then compared the results of the state-of-the-art deep learning models with one another, and compared the computer-aided classification system with ENT doctors.

Results: We assessed the performance of the deep learning models on laryngoscopy images collected from 876 patients. The Xception model performed better and more consistently than almost all of the other models. Its accuracy for the no vocal fold, normal vocal folds, and abnormal vocal folds classes was 98.90 %, 97.36 %, and 96.26 %, respectively. In the comparison with our ENT doctors, the Xception model outperformed a junior doctor and came close to an expert.
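
The abstract does not state how the per-class accuracy figures were computed. One common reading is one-vs-rest accuracy, where each class is scored as (true positives + true negatives) / total images. The following is a minimal sketch under that assumption; the label names, the function `per_class_accuracy`, and the toy labels are hypothetical and are not taken from the study.

```python
# Hypothetical label names for the three classes described in the abstract.
CLASSES = ("no_vocal_fold", "normal", "abnormal")


def per_class_accuracy(y_true, y_pred, classes=CLASSES):
    """Return one-vs-rest accuracy per class: (TP + TN) / N.

    Assumption: this is only one possible definition of per-class
    accuracy; the study may have used a different one.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("at least one labeled image is required")

    scores = {}
    for c in classes:
        # An image counts as correct for class c when the reference label
        # and the prediction agree on "is c" versus "is not c".
        correct = sum((t == c) == (p == c) for t, p in zip(y_true, y_pred))
        scores[c] = correct / n
    return scores


# Toy labels for illustration only (not data from the study).
y_true = ["no_vocal_fold", "normal", "abnormal", "abnormal", "normal", "no_vocal_fold"]
y_pred = ["no_vocal_fold", "normal", "abnormal", "normal", "normal", "no_vocal_fold"]

acc = per_class_accuracy(y_true, y_pred)
for name, value in acc.items():
    print(f"{name}: {value:.2%}")
```

Under this definition a single misclassified image lowers the score of two classes at once: the class it truly belongs to and the class it was assigned to.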

Conclusion: Our results show that current deep learning models can classify vocal fold images well and effectively assist physicians in vocal fold identification and classification of normal or abnormal vocal folds.

Keywords: Computer-aided diagnosis; Deep learning; Flexible laryngoscopy; Vocal folds.

MeSH terms

Deep Learning*
Humans
Laryngoscopy* / methods
Vocal Cords / diagnostic imaging
Vocal Cords / pathology