Different Performances of Machine Learning Models to Classify Dysphonic and Non-Dysphonic Voices

J Voice. 2022 Dec 10:S0892-1997(22)00353-8. doi: 10.1016/j.jvoice.2022.11.001. Online ahead of print.

Abstract

Objective: To analyze the performance of 10 different machine learning (ML) classifiers in discriminating between dysphonic and non-dysphonic voices, using a variance threshold to select and reduce the set of acoustic measurements used by the classifiers.

Method: We analyzed voice samples from 435 individuals (337 female, 98 male; mean age 41.07 ± 13.73 years), of whom 384 were dysphonic and 51 non-dysphonic. From each sustained /ε/ vowel sample, 34 acoustic measurements were extracted, including traditional perturbation and noise measurements, cepstral/spectral measurements, and measurements based on nonlinear models. The variance threshold method was used to select the best subset of acoustic measurements. We tested the performance of the selected subset with 10 ML classifiers using precision, sensitivity, specificity, accuracy, and F1-score. The kappa coefficient was used to verify reproducibility between the training and test sets.
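The variance-threshold selection step described above can be sketched with scikit-learn's `VarianceThreshold`. This is a minimal illustration, not the authors' pipeline: the synthetic data, the threshold value, and the number of features retained are assumptions; only the sample count (435) and feature count (34) come from the abstract.

```python
# Minimal sketch of variance-threshold feature selection (illustrative only).
# The threshold of 1e-3 and the synthetic data are assumptions, not values
# from the study; the study reduced 34 acoustic measures to 15.
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
X = rng.normal(size=(435, 34))   # 435 voice samples x 34 acoustic measures
X[:, 20:] *= 0.01                # make some features near-constant

selector = VarianceThreshold(threshold=1e-3)  # drop low-variance features
X_reduced = selector.fit_transform(X)
print(X_reduced.shape)           # only the higher-variance features remain
```

Features whose variance falls below the threshold are removed before any classifier is trained, which is why the study compares the reduced 15-measure set against the full 34-measure set.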

Results: The naive Bayes (NB) and stochastic gradient descent classifier (SGDC) models performed best in terms of accuracy, AUC, sensitivity, and specificity for the reduced set of 15 acoustic measures compared with the full set of 34. SGDC and NB achieved accuracies of 0.91 and 0.76, respectively, and showed moderate agreement, with kappa values of 0.57 (SGDC) and 0.45 (NB).

Conclusion: Among the tested models, NB and SGDC performed best at discriminating between dysphonic and non-dysphonic voices using the set of 15 acoustic measures.

Keywords: Acoustic; Artificial intelligence; Machine learning; Voice; Voice disorders.