Artificial Neural Networks Combined with the Principal Component Analysis for Non-Fluent Speech Recognition

Sensors (Basel). 2022 Jan 1;22(1):321. doi: 10.3390/s22010321.

Abstract

This paper presents the application of principal component analysis (PCA) for dimensionality reduction of the variables describing the speech signal, and the applicability of the obtained results to the recognition of disturbed and fluent speech. A set of fluent speech signals and three types of speech disturbances (blocks before words starting with plosives, syllable repetitions, and sound-initial prolongations) was transformed using principal component analysis. The result was a model containing four principal components describing the analysed utterances. Distances between the standardised original variables and the elements of the observation matrix in the new coordinate system were calculated and then used in the recognition process. A multilayer perceptron network was used as the classifying algorithm. The achieved results were compared with the outcomes of previous experiments in which speech samples were parameterised using a Kohonen network. The classifying network achieved an overall accuracy of 76% (ranging from 50% to 91%, depending on the dysfluency type).
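
The following is a minimal sketch, not the authors' code, of a pipeline in the spirit of the abstract: standardising the variables, projecting them onto four principal components, and classifying with a multilayer perceptron, using scikit-learn. The feature matrix X and labels y are hypothetical placeholders for the speech-signal variables and the four classes (fluent speech plus the three dysfluency types), and the paper's distance-based features are approximated here by the principal-component scores themselves.

```python
# Hedged sketch: PCA dimensionality reduction (4 components) + MLP classifier.
# X and y are synthetic placeholders, not the data used in the paper.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))       # placeholder: 400 utterances, 20 signal variables
y = rng.integers(0, 4, size=400)     # placeholder: 0 = fluent, 1-3 = dysfluency types

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardise, reduce to four principal components, classify with an MLP.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=4),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
print("overall accuracy:", accuracy_score(y_test, model.predict(X_test)))
```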

Keywords: artificial neural networks; principal component analysis; speech recognition; stuttering.

MeSH terms

  • Humans
  • Neural Networks, Computer
  • Principal Component Analysis
  • Speech
  • Speech Perception*
  • Stuttering*