Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners

J Acoust Soc Am. 2017 Mar;141(3):1985. doi: 10.1121/1.4977197.

Abstract

Machine-learning-based approaches to speech enhancement have recently shown great promise for improving speech intelligibility for hearing-impaired listeners. Here, the performance of three machine-learning algorithms and one classical algorithm, Wiener filtering, was compared. Two of the machine-learning algorithms were based on neural networks, one using a previously reported feature set and one using a feature set derived from an auditory model. The third was a dictionary-based sparse-coding algorithm. Speech intelligibility and quality scores were obtained from participants with mild-to-moderate hearing impairments, who listened to sentences in speech-shaped noise and multi-talker babble after processing by each algorithm. Intelligibility and quality scores were significantly improved by each of the three machine-learning approaches, but not by the classical approach. The largest improvements in both speech intelligibility and quality were obtained with the neural network using the feature set based on auditory modeling. Furthermore, neural-network-based techniques appeared more promising than dictionary-based sparse coding in terms of performance and ease of implementation.
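For readers unfamiliar with the classical baseline, the following is a minimal sketch of a textbook Wiener-filter gain computed per frequency bin. The abstract does not describe the Wiener implementation used in the study, so the function name `wiener_gain`, the power-subtraction SNR estimate, and the `floor` parameter are all assumptions made for illustration only.

```python
def wiener_gain(noisy_power, noise_power, floor=1e-12):
    """Return a per-bin Wiener gain G = SNR / (1 + SNR).

    Illustrative assumption: the SNR in each bin is estimated by
    power subtraction, SNR = max(P_noisy - P_noise, 0) / P_noise.
    This is a generic textbook form, not the study's implementation.
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        p_noise = max(p_noise, floor)  # guard against division by zero
        snr = max(p_noisy - p_noise, 0.0) / p_noise
        gains.append(snr / (1.0 + snr))
    return gains


# Hypothetical power spectra for three frequency bins.
gains = wiener_gain([2.0, 1.0, 11.0], [1.0, 1.0, 1.0])
```

Each gain lies between 0 and 1 and would be multiplied with the corresponding bin of the noisy spectrum: bins dominated by noise are attenuated toward zero, while bins dominated by speech pass nearly unchanged.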

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation
  • Aged
  • Audiometry, Speech
  • Electric Stimulation
  • Female
  • Hearing Aids*
  • Hearing Loss / diagnosis
  • Hearing Loss / psychology
  • Hearing Loss / rehabilitation*
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Neural Networks, Computer
  • Noise / adverse effects*
  • Perceptual Masking*
  • Persons With Hearing Impairments / psychology
  • Persons With Hearing Impairments / rehabilitation*
  • Recognition, Psychology
  • Signal Processing, Computer-Assisted*
  • Speech Intelligibility*
  • Speech Perception*