Machine Learning Assessment of Spasmodic Dysphonia Based on Acoustical and Perceptual Parameters

Federico Calà; Lorenzo Frassineti; Claudia Manfredi; Philippe Dejonckere; Federica Messina; Sergio Barbieri; Lorenzo Pignataro; Giovanna Cantarella

doi:10.3390/bioengineering10040426

Bioengineering (Basel). 2023 Mar 28;10(4):426. doi: 10.3390/bioengineering10040426.

Authors

Federico Calà¹, Lorenzo Frassineti¹ ², Claudia Manfredi¹, Philippe Dejonckere³, Federica Messina⁴, Sergio Barbieri⁵, Lorenzo Pignataro⁴ ⁶, Giovanna Cantarella⁴ ⁶

Affiliations

¹ Department of Information Engineering, Università degli Studi di Firenze, 50139 Firenze, Italy.
² Genetics, Oncology and Clinical Medicine, Università degli Studi di Siena, 53100 Siena, Italy.
³ Federal Agency for Occupational Risks, 1020 Brussels, Belgium.
⁴ Department of Clinical Sciences and Community Health, University of Milan, 20122 Milan, Italy.
⁵ SC Neurofisiopatologia, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, 20122 Milan, Italy.
⁶ Department of Otolaryngology, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, 20122 Milan, Italy.

Abstract

Adductor spasmodic dysphonia is a type of adult-onset focal dystonia characterized by involuntary spasms of the laryngeal muscles. This paper applied machine learning techniques to the severity assessment of spasmodic dysphonia. To this aim, 7 perceptual indices and 48 acoustical parameters were estimated from the Italian word /a'jwɔle/, uttered by 28 female patients and manually segmented from a standardized sentence, and were used as features in two classification experiments. Subjects were divided into three severity classes (mild, moderate, severe) on the basis of the G (grade) score of the GRB scale. The first aim was to find relationships between perceptual and objective measures using the Local Interpretable Model-Agnostic Explanations (LIME) method. The second was to investigate the development of a diagnostic tool for assessing the severity of adductor spasmodic dysphonia. Reliable relationships were found between the perceptual indices G, R (roughness), B (breathiness), and spasmodicity and the acoustical parameters voiced percentage, F2 median, and F1 median. After data scaling, Bayesian hyperparameter optimization, and leave-one-out cross-validation, a k-nearest neighbors model achieved 89% accuracy in distinguishing patients among the three severity classes. The proposed methods highlighted the best acoustical parameters that could be used jointly with GRB indices to support the perceptual evaluation of spasmodic dysphonia and provide a tool to help assess its severity.
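
The classification procedure summarized in the abstract (data scaling, a k-nearest neighbors model, leave-one-out cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins rather than the study's acoustical parameters, all function and variable names are hypothetical, and a plain search over a few values of k replaces the Bayesian hyperparameter optimization used in the paper.

```python
import math
import random

def standardize(train, test_row):
    """Z-score features using statistics from the training fold only."""
    n_feat = len(train[0])
    means = [sum(r[j] for r in train) / len(train) for j in range(n_feat)]
    stds = []
    for j in range(n_feat):
        var = sum((r[j] - means[j]) ** 2 for r in train) / len(train)
        stds.append(math.sqrt(var) or 1.0)  # guard against constant features

    def scale(row):
        return [(row[j] - means[j]) / stds[j] for j in range(n_feat)]

    return [scale(r) for r in train], scale(test_row)

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = [train_y[i] for i in order[:k]]
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(set(votes)), key=votes.count)

def loo_accuracy(X, y, k):
    """Leave-one-out accuracy: each subject is held out exactly once."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        train_s, test_s = standardize(train_X, X[i])
        hits += knn_predict(train_s, train_y, test_s, k) == y[i]
    return hits / len(X)

# Synthetic stand-in data (assumption): three severity classes, three features.
random.seed(0)
centers = {
    "mild": (0.0, 0.0, 0.0),
    "moderate": (3.0, 3.0, 0.0),
    "severe": (6.0, 0.0, 3.0),
}
X, y = [], []
for label, center in centers.items():
    for _ in range(9):
        X.append([random.gauss(m, 1.0) for m in center])
        y.append(label)

# Simple search over k, standing in for Bayesian hyperparameter optimization.
scores = {k: loo_accuracy(X, y, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

In this sketch the scaling statistics are recomputed inside every fold, so the held-out subject never influences them. Selecting k on the same leave-one-out scores that are then reported gives an optimistic estimate; a nested scheme would be needed for an unbiased one.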

Keywords: BioVoice; LIME; acoustical analysis; machine learning; spasmodic dysphonia; voice assessment.

Grants and funding

2018.0976/Ente Cassa di Risparmio di Firenze