Single-ended prediction of listening effort using deep neural networks

Hear Res. 2018 Mar:359:40-49. doi: 10.1016/j.heares.2017.12.014. Epub 2017 Dec 27.

Abstract

The effort required to listen to and understand noisy speech is an important factor in the evaluation of noise reduction schemes. This paper introduces a model for Listening Effort prediction from Acoustic Parameters (LEAP). The model is based on methods from automatic speech recognition, specifically on performance measures that quantify the degradation of phoneme posteriorgrams produced by a deep neural network: noise or artifacts introduced by speech enhancement often result in a temporal smearing of phoneme representations, which is measured by comparing phoneme vectors. This procedure does not require a priori knowledge about the processed speech and is therefore single-ended. The proposed model was evaluated using three datasets of noisy speech signals with listening effort ratings obtained from normal-hearing and hearing-impaired subjects. The prediction quality was compared to several baseline models, such as the ITU-T standard P.563 for single-ended speech quality assessment, the American National Standard ANIQUE+ for single-ended speech quality assessment, and a single-ended SNR estimator. In all three datasets, the proposed model achieved clearly better prediction accuracy than the baseline models; correlations with subjective ratings were above 0.9. So far, the model has been trained on the specific noise types used in the evaluation. Future work will address this limitation by training the model on a variety of noise types (multi-condition training) so that it generalizes to unknown noise types.
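The idea of measuring temporal smearing by comparing phoneme vectors can be illustrated with a minimal sketch. The abstract does not say which comparison is used, so everything specific here is an assumption made for illustration: the symmetric Kullback-Leibler divergence as the distance, the lag range, the moving-average `smear` helper, and the synthetic `toy_posteriorgram` that stands in for real DNN output. The sketch averages the distance between posterior vectors separated by a fixed number of frames; a smeared posteriorgram changes more slowly over time and should therefore yield a lower score.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-10):
    """Symmetric KL divergence between rows of p and q (assumed distance)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return np.sum(p * np.log(p / q) + q * np.log(q / p), axis=-1)

def mean_temporal_distance(post, max_lag=10):
    """Average distance between phoneme vectors `lag` frames apart.

    post: array of shape (n_frames, n_phonemes), each row sums to 1.
    max_lag: largest frame separation considered (hypothetical choice).
    """
    post = np.asarray(post, dtype=float)
    n_frames = post.shape[0]
    lags = range(1, min(max_lag, n_frames - 1) + 1)
    per_lag = [symmetric_kl(post[lag:], post[:-lag]).mean() for lag in lags]
    return float(np.mean(per_lag))

def toy_posteriorgram(n_phonemes=4, frames_per_phoneme=10):
    """Synthetic 'clean' posteriorgram: one dominant phoneme per segment."""
    floor = 0.01
    post = np.full((n_phonemes * frames_per_phoneme, n_phonemes), floor)
    for k in range(n_phonemes):
        rows = slice(k * frames_per_phoneme, (k + 1) * frames_per_phoneme)
        post[rows, k] = 1.0 - floor * (n_phonemes - 1)
    return post

def smear(post, width=5):
    """Blur each phoneme track over time with a moving average, then renormalize."""
    kernel = np.ones(width) / width
    out = np.apply_along_axis(
        lambda track: np.convolve(track, kernel, mode="same"), 0, post
    )
    return out / out.sum(axis=1, keepdims=True)

clean = toy_posteriorgram()
degraded = smear(clean)
score_clean = mean_temporal_distance(clean)
score_degraded = mean_temporal_distance(degraded)
```

In this toy setup `score_degraded` comes out lower than `score_clean`, which is the direction one would expect if smearing tracks degradation. Mapping such a score to listening effort ratings would need the subjective data described in the abstract and is not attempted here.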

Keywords: Automatic speech recognition; Deep neural networks; Listening effort prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation
  • Adult
  • Aged
  • Attention*
  • Audiometry, Speech
  • Auditory Pathways / physiopathology
  • Case-Control Studies
  • Deep Learning*
  • Female
  • Hearing
  • Hearing Disorders / diagnosis
  • Hearing Disorders / physiopathology
  • Hearing Disorders / psychology*
  • Humans
  • Male
  • Middle Aged
  • Models, Psychological*
  • Noise / adverse effects*
  • Perceptual Masking*
  • Persons With Hearing Impairments / psychology*
  • Speech Perception*
  • Young Adult