Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction

J Acoust Soc Am. 1994 Mar;95(3):1593-602. doi: 10.1121/1.408546.

Abstract

A novel approach for analyzing and filtering speech is described and evaluated. It uses the "modulation spectrogram," i.e., the two-dimensional representation of modulation frequency versus center frequency as a function of time. The approach is based on physiological findings of a tonotopical organization of modulation frequencies perpendicular to carrier frequencies, as well as on psychoacoustical findings of "modulation tuning curves." In addition, an interaction is assumed between the representation of modulation frequencies and the representation of auditory space, as described by physiological and psychological models of binaural hearing. A noise-reduction algorithm based on this approach was implemented and tested; it enhances or suppresses each combination of modulation frequency and center frequency according to its phase and intensity relation between the two input signals (i.e., the two stereo channels of a dummy-head recording). When tested in several situations with interfering speakers and background noise, in both anechoic and reverberant environments, the algorithm provided a small but very robust increase in speech intelligibility, corresponding to approximately 2 dB in signal-to-noise ratio. Possible applications of this algorithm are noise reduction in adverse acoustical situations, processing schemes for digital hearing aids, and preprocessing for speech recognition.
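The idea of weighting each modulation-frequency bin by its interaural phase and intensity relation can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the abstract does not give the gain rule, so the function names (`modulation_spectrum`, `binaural_gain`), the thresholds (`max_phase`, `max_level_db`), the attenuation `floor`, and the synthetic envelopes are all assumptions chosen for illustration. The sketch handles a single center-frequency band and a single time frame, and assumes the target lies straight ahead, so that its modulation components arrive with nearly equal phase and level in both channels.

```python
import cmath
import math


def modulation_spectrum(envelope):
    """Naive DFT of one band-envelope frame; returns complex bins 0..N//2."""
    n = len(envelope)
    return [
        sum(envelope[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def binaural_gain(left_bin, right_bin, max_phase=math.pi / 8,
                  max_level_db=3.0, floor=0.1):
    """Hypothetical gain rule (an assumption, not taken from the paper):
    pass a bin when the interaural phase and level differences are both
    small, otherwise attenuate it to `floor`."""
    mag_l, mag_r = abs(left_bin), abs(right_bin)
    if mag_l == 0.0 or mag_r == 0.0:
        return floor
    # Phase of L * conj(R) is the phase difference, wrapped to (-pi, pi].
    phase_diff = abs(cmath.phase(left_bin * right_bin.conjugate()))
    level_diff_db = abs(20.0 * math.log10(mag_l / mag_r))
    if phase_diff <= max_phase and level_diff_db <= max_level_db:
        return 1.0
    return floor


# Synthetic envelopes for one band: a "target" modulation in bin 4 that is
# identical in both channels, plus an "interferer" in bin 7 that is weaker
# and phase-shifted by pi/2 in the right channel.
frame_len = 32
left_env = [
    1.0
    + 0.5 * math.cos(2 * math.pi * 4 * t / frame_len)
    + 0.5 * math.cos(2 * math.pi * 7 * t / frame_len)
    for t in range(frame_len)
]
right_env = [
    1.0
    + 0.5 * math.cos(2 * math.pi * 4 * t / frame_len)
    + 0.25 * math.cos(2 * math.pi * 7 * t / frame_len - math.pi / 2)
    for t in range(frame_len)
]

left_spec = modulation_spectrum(left_env)
right_spec = modulation_spectrum(right_env)

# One gain per modulation-frequency bin of this band and frame.
gains = [binaural_gain(l, r) for l, r in zip(left_spec, right_spec)]
```

In this toy case the gains for bins 0 and 4 stay at 1.0 while bin 7 is reduced to the floor. Bins that contain only numerical noise receive arbitrary gains, so a practical scheme would need smoothing or a magnitude threshold. A full system would also repeat the procedure for every center-frequency band and time frame, then resynthesize the signal.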

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms
  • Attention / physiology*
  • Auditory Cortex / physiology
  • Auditory Pathways / physiology
  • Dichotic Listening Tests*
  • Female
  • Functional Laterality / physiology*
  • Geniculate Bodies / physiology
  • Hearing Aids*
  • Humans
  • Inferior Colliculi / physiology
  • Male
  • Perceptual Masking / physiology
  • Psychoacoustics
  • Signal Processing, Computer-Assisted / instrumentation
  • Sound Spectrography / instrumentation
  • Speech Acoustics
  • Speech Intelligibility
  • Speech Perception / physiology*