Machine Learning Assessment of Spasmodic Dysphonia Based on Acoustical and Perceptual Parameters

Bioengineering (Basel). 2023 Mar 28;10(4):426. doi: 10.3390/bioengineering10040426.

Abstract

Adductor spasmodic dysphonia is a type of adult-onset focal dystonia characterized by involuntary spasms of laryngeal muscles. This paper applied machine learning techniques for the severity assessment of spasmodic dysphonia. To this aim, 7 perceptual indices and 48 acoustical parameters were estimated from the Italian word /a'jwɔle/ emitted by 28 female patients, manually segmented from a standardized sentence and used as features in two classification experiments. Subjects were divided into three severity classes (mild, moderate, severe) on the basis of the G (grade) score of the GRB scale. The first aim was that of finding relationships between perceptual and objective measures with the Local Interpretable Model-Agnostic Explanations method. Then, the development of a diagnostic tool for adductor spasmodic dysphonia severity assessment was investigated. Reliable relationships between G; R (Roughness); B (Breathiness); Spasmodicity; and the acoustical parameters: voiced percentage, F2 median, and F1 median were found. After data scaling, Bayesian hyperparameter optimization, and leave-one-out cross-validation, a k-nearest neighbors model provided 89% accuracy in distinguishing patients among the three severity classes. The proposed methods highlighted the best acoustical parameters that could be used jointly with GRB indices to support the perceptual evaluation of spasmodic dysphonia and provide a tool to help severity assessment of spasmodic dysphonia.

Keywords: BioVoice; LIME; acoustical analysis; machine learning; spasmodic dysphonia; voice assessment.