Speech transmission index from running speech: a neural network approach

J Acoust Soc Am. 2003 Apr;113(4 Pt 1):1999-2008. doi: 10.1121/1.1558373.

Abstract

The speech transmission index (STI) is an important objective parameter of speech intelligibility for sound transmission channels. It is normally measured with specific test signals to ensure high accuracy and good repeatability. Measurement with running speech was previously proposed, but its accuracy is compromised and its applications are therefore limited. This paper develops a new approach that uses artificial neural networks to accurately extract the STI from received running speech. The neural networks are trained on a large set of transmitted speech examples with prior knowledge of the transmission channels' STIs. The networks perform complicated nonlinear function mappings and spectral feature memorization to enable accurate objective parameter extraction from transmitted speech. Validation via simulations demonstrates the feasibility of this new method on a one-net-one-speech extract basis. In this case, accuracy is comparable with normal measurement methods. The method provides an alternative to standard measurement techniques, and it is intended that it can facilitate occupied room acoustic measurements.
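
The training described above relies on knowing each channel's STI in advance. As a rough illustration of how such a target value can follow from a channel's modulation transfer function, the sketch below computes a simplified STI for an idealized, noise-free channel with purely exponential reverberation. It is not the paper's method: the function names and the `rt60_s` parameter are hypothetical, a single band with equal weighting is assumed, and the octave-band weighting and masking corrections of the full standard procedure are omitted.

```python
import math

# The 14 modulation frequencies (Hz), one-third-octave spaced, used by the full STI method.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def mtf_reverb(mod_freq_hz, rt60_s):
    """Modulation reduction factor m(F) for an ideal exponential decay.

    Assumption: no background noise, so only reverberation reduces modulation.
    """
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)


def transmission_index(m):
    """Map a modulation reduction factor to a transmission index in [0, 1]."""
    # Keep m strictly inside (0, 1) so the logarithm stays finite.
    m = min(max(m, 1e-6), 1.0 - 1e-6)
    # Apparent signal-to-noise ratio in dB, limited to the +/-15 dB range.
    snr_db = 10.0 * math.log10(m / (1.0 - m))
    snr_db = min(max(snr_db, -15.0), 15.0)
    return (snr_db + 15.0) / 30.0


def simplified_sti(rt60_s):
    """Equal-weight average of the transmission indices (illustrative simplification)."""
    tis = [transmission_index(mtf_reverb(f, rt60_s)) for f in MOD_FREQS_HZ]
    return sum(tis) / len(tis)


if __name__ == "__main__":
    for rt60 in (0.3, 1.0, 2.0):
        print(f"RT60 = {rt60:.1f} s -> simplified STI = {simplified_sti(rt60):.2f}")
```

In a supervised setup like the one the abstract describes, values of this kind would serve as regression targets paired with the received speech, while the network input would be derived from the speech signal itself.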

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Neural Networks, Computer*
  • Nonlinear Dynamics
  • Sound Spectrography
  • Speech Acoustics
  • Speech Intelligibility*
  • Speech Perception*