Estimation of phoneme-specific HMM topologies for the automatic recognition of dysarthric speech

Santiago-Omar Caballero-Morales

doi:10.1155/2013/297860

Comput Math Methods Med. 2013:2013:297860. doi: 10.1155/2013/297860. Epub 2013 Oct 8.

Author

Santiago-Omar Caballero-Morales¹

Affiliation

¹ Technological University of the Mixteca, Road to Acatlima K.m. 2.5, Huajuapan de León, 69000 Oaxaca, OAX, Mexico.

Abstract

Dysarthria is a common motor speech disorder that can be caused by neurological trauma, cerebral palsy, or degenerative neurological diseases. Because dysarthria affects phonation, articulation, and prosody, the spoken communication of dysarthric speakers is seriously restricted, which affects their quality of life and confidence. Assistive technology has led to the development of speech applications that improve the spoken communication of dysarthric speakers. Within this field, this paper presents an approach to improve the accuracy of HMM-based speech recognition systems. Because phonatory dysfunction is a main characteristic of dysarthric speech, the phonemes of a dysarthric speaker are affected to different degrees. The approach therefore consists of finding the most suitable type of HMM topology (Bakis or ergodic) for each phoneme in the speaker's phonetic repertoire. The topology is further refined with a suitable number of states and Gaussian mixture components for acoustic modelling. This differs from studies in which a single topology is assumed for all phonemes. The search for suitable parameters (topology and mixture components) is performed with a Genetic Algorithm (GA). Experiments with a well-known dysarthric speech database showed statistically significant improvements for the proposed approach over the single-topology approach, even for speakers with severe dysarthria.
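
To make the search space in the abstract concrete, the following is a minimal Python sketch of how a per-phoneme choice of topology, number of states, and mixture components might be encoded as a GA individual. It is an illustration under stated assumptions, not the paper's implementation: the function names, the one-state skip limit for the Bakis model, the uniform initial transition probabilities, and the candidate state and mixture counts are all hypothetical. Bakis is taken here to mean a left-to-right model with self-loops, next-state transitions, and a limited skip; ergodic means every state can reach every other state.

```python
import random


def bakis_transitions(n_states, max_skip=1):
    """Left-to-right (Bakis) matrix: stay, advance, or skip up to max_skip states.

    Allowed transitions get equal initial probability (an assumption);
    backward transitions are fixed at zero.
    """
    matrix = []
    for i in range(n_states):
        allowed = [j for j in range(n_states) if i <= j <= i + 1 + max_skip]
        matrix.append([1.0 / len(allowed) if j in allowed else 0.0
                       for j in range(n_states)])
    return matrix


def ergodic_transitions(n_states):
    """Fully connected (ergodic) matrix with uniform initial probabilities."""
    return [[1.0 / n_states] * n_states for _ in range(n_states)]


TOPOLOGIES = {"bakis": bakis_transitions, "ergodic": ergodic_transitions}


def random_gene(rng, state_choices=(3, 4, 5, 6), mixture_choices=(1, 2, 4, 8)):
    """One phoneme's settings; the candidate values are hypothetical."""
    return {
        "topology": rng.choice(sorted(TOPOLOGIES)),
        "n_states": rng.choice(state_choices),
        "n_mixtures": rng.choice(mixture_choices),
    }


def random_chromosome(phonemes, rng):
    """A GA individual: one gene per phoneme in the speaker's repertoire."""
    return {phoneme: random_gene(rng) for phoneme in phonemes}


def initial_transitions(gene):
    """Build the initial transition matrix implied by a gene."""
    return TOPOLOGIES[gene["topology"]](gene["n_states"])


if __name__ == "__main__":
    rng = random.Random(0)
    individual = random_chromosome(["aa", "s", "t"], rng)
    for phoneme, gene in individual.items():
        print(phoneme, gene, len(initial_transitions(gene)))
```

In a full system, each such individual would be scored by training the corresponding HMMs and measuring recognition accuracy, and the GA would recombine and mutate the genes; those steps are omitted here because the abstract does not specify them.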

MeSH terms

Algorithms
Communication Aids for Disabled*
Dysarthria / physiopathology*
Dysarthria / rehabilitation*
Humans
Markov Chains
Neural Networks, Computer
Speech Acoustics
Speech Recognition Software*