Comparison of a target-equalization-cancellation approach and a localization approach to source separation

J Acoust Soc Am. 2017 Nov;142(5):2933. doi: 10.1121/1.5009763.

Abstract

Interaural differences help listeners maintain focus on a sound source of interest in the presence of multiple sources. Because interaural differences are sound localization cues, most binaural-cue-based source separation algorithms attempt separation by localizing each time-frequency (T-F) unit to one of the possible source directions using interaural differences. By assembling the T-F units that are assigned to one direction, the sound stream from that direction is enhanced. In this paper, a different type of binaural cue for source-separation purposes is proposed. For each T-F unit, the equalization-cancellation (EC) operation is applied to cancel the signal from the target direction; the dominance of the target in that T-F unit is then determined by the effectiveness of the cancellation. Specifically, the energy change from cancellation is used as the criterion for target dominance for each T-F unit. Source-separation performance using the target-EC cue is compared with performance using localization cues. With simulated multi-talker and diffuse-babble interferers, the algorithm based on target-EC cues yields better source-separation performance than the algorithm based on localization cues, both in direct comparison with the ideal binary mask and in measured speech intelligibility for the separated target streams.
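
The target-EC cue described above can be illustrated with a minimal sketch for a single T-F unit. This is not the authors' implementation: the function names, the energy normalization, the sign convention for the interaural time difference, and the 6 dB threshold are all assumptions made for illustration. The unit is assumed to be given as complex spectral samples from the left and right ears, and the target's interaural time and level differences are assumed to be known.

```python
import cmath
import math

EPS = 1e-12  # floor that avoids log(0) when cancellation is perfect


def ec_energy_drop_db(left, right, freq_hz, target_itd_s=0.0, target_ild_db=0.0):
    """Energy drop in dB after cancelling the target direction in one T-F unit.

    Assumed convention (hypothetical): a pure target satisfies
    left = gain * exp(-2j*pi*f*itd) * right.
    """
    # Equalization: give the right-ear samples the target's interaural
    # level and phase so that a pure target matches the left ear exactly.
    gain = 10.0 ** (target_ild_db / 20.0)
    phase = cmath.exp(-2j * math.pi * freq_hz * target_itd_s)
    equalized = [gain * phase * r for r in right]

    # Cancellation: subtract the equalized right ear from the left ear.
    residual = [l - e for l, e in zip(left, equalized)]

    energy_before = sum(abs(l) ** 2 for l in left) + sum(abs(e) ** 2 for e in equalized)
    energy_after = sum(abs(x) ** 2 for x in residual)
    return 10.0 * math.log10((energy_before + EPS) / (energy_after + EPS))


def target_dominant(left, right, freq_hz, threshold_db=6.0, **target_cues):
    """Binary-mask decision: True when cancellation removes most of the energy.

    threshold_db is an illustrative value, not one taken from the paper.
    """
    return ec_energy_drop_db(left, right, freq_hz, **target_cues) > threshold_db


# A unit holding only a frontal target (identical ears) cancels almost fully,
# while a unit whose ears are in antiphase does not cancel at all.
unit = [1 + 0j, 0.5j]
frontal_target = target_dominant(unit, unit, freq_hz=500.0)
antiphase_source = target_dominant(unit, [-x for x in unit], freq_hz=500.0)
```

Collecting the decisions over all T-F units would give a binary mask that can be compared with the ideal binary mask, which is the kind of comparison the abstract reports.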

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Acoustic Stimulation / methods
  • Algorithms
  • Auditory Pathways / physiology*
  • Computer Simulation
  • Cues*
  • Humans
  • Models, Theoretical
  • Noise / adverse effects*
  • Perceptual Masking*
  • Sound Localization*
  • Speech Intelligibility
  • Speech Perception*