Vocal cord lesions classification based on deep convolutional neural network and transfer learning

Med Phys. 2022 Jan;49(1):432-442. doi: 10.1002/mp.15371. Epub 2021 Dec 8.

Abstract

Purpose: Laryngoscopy, the most common diagnostic method for vocal cord lesions (VCLs), relies mainly on subjective visual inspection by otolaryngologists. This study aimed to establish a highly objective computer-aided VCL diagnosis system based on a deep convolutional neural network (DCNN) and transfer learning.

Methods: To classify VCLs, our method combined a DCNN backbone with transfer learning, fine-tuning the system specifically on a laryngoscopy image dataset. A laryngoscopy image database was collected to train the proposed system. Its diagnostic performance was compared with that of other DCNN-based models. F1 score and receiver operating characteristic (ROC) curve analyses were conducted to evaluate the performance of the system.
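As an illustration of the F1 evaluation mentioned in the Methods, the following minimal sketch computes a multi-class F1 score from predicted and true labels. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here; the function name `macro_f1` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter

# The four classes named in the abstract.
CLASSES = ["normal", "polyp", "keratinization", "carcinoma"]


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 scores.

    Macro averaging is an assumption; the abstract does not say
    how the reported F1 score was averaged.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only.
y_true = ["normal", "normal", "polyp", "polyp", "keratinization", "carcinoma"]
y_pred = ["normal", "polyp", "polyp", "polyp", "keratinization", "keratinization"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy example the single missed carcinoma gives that class an F1 of zero, which pulls the macro average down even though most of the other predictions are correct.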

Results: Surpassing existing VCL diagnosis methods, the proposed system achieved an overall accuracy of 80.23%, an F1 score of 0.7836, and an area under the curve (AUC) of 0.9557 for four fine-grained classes of VCLs, namely, normal, polyp, keratinization, and carcinoma. It also demonstrated robust capacity for distinguishing urgent (keratinization, carcinoma) from non-urgent (normal, polyp) cases, with an overall accuracy of 0.939, a sensitivity of 0.887, a specificity of 0.993, and an AUC of 0.9828. The proposed method also outperformed clinicians in the classification of normal, polyp, and carcinoma cases at an extremely low time cost.
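The urgent versus non-urgent evaluation can be illustrated by collapsing the four classes into two groups and computing accuracy, sensitivity, and specificity. This is a minimal sketch assuming that "urgent" is treated as the positive class; the helper `urgent_metrics` and the toy labels are hypothetical and do not reproduce the study's data or results.

```python
# Grouping taken from the abstract: keratinization and carcinoma are urgent.
URGENT = {"keratinization", "carcinoma"}


def urgent_metrics(y_true, y_pred):
    """Binary metrics with 'urgent' as the positive class (an assumption).

    Assumes at least one urgent and one non-urgent case in y_true.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        truth_urgent = truth in URGENT
        pred_urgent = pred in URGENT
        if truth_urgent and pred_urgent:
            tp += 1
        elif not truth_urgent and not pred_urgent:
            tn += 1
        elif pred_urgent:
            fp += 1  # non-urgent case flagged as urgent
        else:
            fn += 1  # urgent case missed
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy labels, for illustration only.
y_true = ["normal", "polyp", "polyp", "keratinization", "carcinoma", "carcinoma"]
y_pred = ["normal", "polyp", "carcinoma", "carcinoma", "keratinization", "polyp"]
metrics = urgent_metrics(y_true, y_pred)
print(metrics)
```

Confusions inside a group (for example, keratinization predicted as carcinoma) still count as correct in the binary setting, which is one reason a two-class accuracy can be higher than the four-class accuracy computed on the same predictions.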

Conclusion: The VCL diagnosis system succeeded in using a DCNN to distinguish the most common VCLs from normal cases, showing practical potential for improving overall diagnostic efficacy in VCL examinations. The proposed system could be integrated into the conventional laryngoscopy workflow for VCLs as a highly objective auxiliary method.

Keywords: computer-aided diagnosis; deep learning; laryngoscopy; transfer learning; vocal cord lesion classification.

MeSH terms

  • Area Under Curve
  • Machine Learning
  • Neural Networks, Computer*
  • ROC Curve
  • Vocal Cords* / diagnostic imaging