Speech transmission index from running speech: a neural network approach

F F Li; T J Cox

doi:10.1121/1.1558373

J Acoust Soc Am. 2003 Apr;113(4 Pt 1):1999-2008. doi: 10.1121/1.1558373.

Authors

F F Li¹, T J Cox

Affiliation

¹ School of Acoustics and Electronic Engineering, University of Salford, Salford M5 4WT, United Kingdom.

PMID: 12703711
DOI: 10.1121/1.1558373

Abstract

Speech transmission index (STI) is an important objective parameter concerning speech intelligibility for sound transmission channels. It is normally measured with specific test signals to ensure high accuracy and good repeatability. Measurement with running speech was previously proposed, but accuracy is compromised and hence applications limited. A new approach that uses artificial neural networks to accurately extract the STI from received running speech is developed in this paper. Neural networks are trained on a large set of transmitted speech examples with prior knowledge of the transmission channels' STIs. The networks perform complicated nonlinear function mappings and spectral feature memorization to enable accurate objective parameter extraction from transmitted speech. Validations via simulations demonstrate the feasibility of this new method on a one-net-one-speech extract basis. In this case, accuracy is comparable with normal measurement methods. This provides an alternative to standard measurement techniques, and it is intended that the neural network method can facilitate occupied room acoustic measurements.
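For readers unfamiliar with the quantity being estimated, the sketch below shows how an STI-like value is conventionally derived from modulation transfer values. It illustrates the target parameter only and is not the neural network method of the paper. Each modulation transfer value is converted to an apparent signal-to-noise ratio, clipped to ±15 dB, and rescaled to the range 0 to 1. The equal weighting across values, the function names, and the purely exponential reverberant decay used to generate example data are simplifying assumptions made here for illustration.

```python
import math

# Modulation frequencies (Hz) conventionally used for STI, in one-third-octave steps.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def transmission_index(m):
    """Map one modulation transfer value m (0..1) to a transmission index (0..1)."""
    # Guard the endpoints so the logarithm below stays finite.
    m = min(max(m, 1e-6), 1.0 - 1e-6)
    snr_db = 10.0 * math.log10(m / (1.0 - m))  # apparent signal-to-noise ratio
    snr_db = min(max(snr_db, -15.0), 15.0)     # clip to the +/-15 dB range
    return (snr_db + 15.0) / 30.0


def sti_unweighted(mtf_values):
    """Average the transmission indices with equal weights (an assumption;
    standardized procedures apply octave-band weighting)."""
    indices = [transmission_index(m) for m in mtf_values]
    return sum(indices) / len(indices)


def reverberant_mtf(mod_freq_hz, rt60_s):
    """Modulation transfer value for an idealized exponential decay with
    reverberation time rt60_s and no background noise (hypothetical channel)."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)


def sti_for_reverberation(rt60_s):
    """STI-like value for the idealized reverberant channel above."""
    return sti_unweighted([reverberant_mtf(f, rt60_s) for f in MOD_FREQS_HZ])
```

Under these assumptions, `sti_for_reverberation(0.5)` gives a higher value than `sti_for_reverberation(2.0)`, reflecting the loss of intelligibility as reverberation increases. In the approach the abstract describes, values of this kind would serve as the known training targets for the networks, with received running speech as the input.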

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Humans
Neural Networks, Computer*
Nonlinear Dynamics
Sound Spectrography
Speech Acoustics
Speech Intelligibility*
Speech Perception*