Developing an Artificial Intelligence Tool to Predict Vocal Cord Pathology in Primary Care Settings

Laryngoscope. 2023 Aug;133(8):1952-1960. doi: 10.1002/lary.30432. Epub 2022 Oct 13.

Abstract

Objectives: Diagnostic tools for voice disorders are lacking for primary care physicians. Artificial intelligence (AI) tools may add to the armamentarium for physicians, decreasing the time to diagnosis and limiting the burden of dysphonia.

Methods: Voice recordings of patients were collected from 2019 to 2021 using smartphones. The Saarbruecken dataset was included for comparison. Audio files were converted to mel-spectrograms using TensorFlow. Diagnostic categories were created to group pathology, including neurological and muscular disorders, inflammatory, mass lesions, and normal. The samples were further separated into sustained /a/ and the Rainbow Passage.
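The mel-spectrogram conversion step can be illustrated with a minimal sketch. The abstract states only that TensorFlow was used, so everything specific here is an assumption for illustration: the function name `log_mel_spectrogram`, the 16 kHz sample rate, the frame sizes, the number of mel bins, and the frequency range are not the authors' settings.

```python
import numpy as np
import tensorflow as tf


def log_mel_spectrogram(waveform, sample_rate=16000, frame_length=1024,
                        frame_step=256, num_mel_bins=64):
    """Convert a mono waveform to a log-mel spectrogram (frames x mel bins).

    All parameter defaults are illustrative assumptions, not values
    reported in the study.
    """
    waveform = tf.convert_to_tensor(waveform, dtype=tf.float32)

    # Short-time Fourier transform: complex tensor of shape (frames, fft_bins).
    stft = tf.signal.stft(waveform, frame_length=frame_length,
                          frame_step=frame_step)
    magnitude = tf.abs(stft)

    # Matrix that maps linear-frequency bins onto the mel scale.
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=num_mel_bins,
        num_spectrogram_bins=magnitude.shape[-1],
        sample_rate=sample_rate,
        lower_edge_hertz=50.0,
        upper_edge_hertz=7600.0,
    )
    mel = tf.matmul(magnitude, mel_matrix)

    # Log compression; the small offset avoids log(0).
    return tf.math.log(mel + 1e-6)


# One second of a synthetic 220 Hz tone stands in for a sustained /a/ recording.
t = np.arange(16000) / 16000.0
tone = np.sin(2.0 * np.pi * 220.0 * t).astype(np.float32)
spec = log_mel_spectrogram(tone)
print(spec.shape)
```

The resulting two-dimensional array can be treated as an image-like input to a neural network, which is one common reason for choosing a spectrogram representation of audio.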

Results: Two hundred three prospective samples and 1131 samples from the Saarbruecken database were used. The AI detected abnormal pathology with an F1 score of 98%. The artificial neural network (ANN) differentiated key pathologies, including unilateral paralysis, laryngitis, adductor spasmodic dysphonia (ADSD), mass lesions, and normal samples, with F1 scores of 39%-87%. The Calgary database models had higher F1 scores in a head-to-head comparison with the Saarbruecken and combined datasets (87% vs. 58% and 50%). The AI outperformed otolaryngologists on a standardized test set of recordings (83% compared to 55% ± 15%).
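For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall for a given class. The following is a minimal sketch of that standard definition; the helper name `f1_for_class` and the toy labels are hypothetical and are not data from the study.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (not study data).
y_true = ["abnormal", "abnormal", "abnormal", "normal", "normal"]
y_pred = ["abnormal", "abnormal", "normal", "normal", "abnormal"]
score = f1_for_class(y_true, y_pred, positive="abnormal")
print(round(score, 3))
```

Because F1 ignores true negatives, it can differ noticeably from plain accuracy when the classes are imbalanced.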

Conclusion: An AI tool was created to differentiate pathology by individual or categorical diagnosis with high evaluation metrics. Prospective data should be collected in a controlled fashion to reduce intrinsic variability between recordings. Multi-center data collaborations are imperative to increase the prediction capability of AI tools for detecting vocal cord pathology. We provide proof-of-concept for an AI tool to assist primary care physicians in managing dysphonic patients.

Level of evidence: 3

Keywords: artificial intelligence; artificial neural network; dysphonia; voice.

MeSH terms

  • Artificial Intelligence
  • Dysphonia* / diagnosis
  • Humans
  • Primary Health Care
  • Prospective Studies
  • Vocal Cords