Dual-Channel Cosine Function Based ITD Estimation for Robust Speech Separation

Sensors (Basel). 2017 Jun 20;17(6):1447. doi: 10.3390/s17061447.

Abstract

In speech separation tasks, many methods require the microphones to be closely spaced, which means that they cannot cope with phase wrap-around. In this paper, we present a novel two-microphone speech separation scheme that does not have this restriction. The technique uses estimated interaural time difference (ITD) statistics and a binary time-frequency mask to separate mixed speech sources. The novelties of the paper are: (1) the extended application of delay-and-sum beamforming (DSB) and the cosine function to ITD calculation; and (2) the clarification of the connection between the ideal binary mask and the DSB amplitude ratio. Objective quality evaluation experiments demonstrate the effectiveness of the proposed method.

Keywords: binary time-frequency mask; cosine function; delay-and-sum beamforming.
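
The abstract names the building blocks (DSB, a cosine function, ITD estimation, a binary time-frequency mask) without giving formulas. The following is a minimal illustrative sketch, not the authors' algorithm. It relies on one standard identity: if channel 2 is a copy of channel 1 delayed by d, so that X2 = X1·exp(-j2πfd), then the two-channel delay-and-sum output steered to delay τ has magnitude 2|X1|·|cos(πf(τ - d))|, which peaks at τ = d in every frequency bin regardless of phase wrapping. The function names, the sign convention, the candidate-delay grid, and the synthetic spectra are all assumptions made for illustration.

```python
import numpy as np

def dsb_magnitude(X1, X2, freqs_hz, tau):
    """Magnitude of a two-channel delay-and-sum output steered to delay tau (s).

    Assumed convention: a source with ITD d gives X2 = X1 * exp(-2j*pi*f*d).
    """
    return np.abs(X1 + X2 * np.exp(2j * np.pi * freqs_hz * tau))

def estimate_itd(X1, X2, freqs_hz, candidates_s):
    """Pick the candidate delay whose DSB magnitude, summed over bins, is largest."""
    scores = [dsb_magnitude(X1, X2, freqs_hz, tau).sum() for tau in candidates_s]
    return candidates_s[int(np.argmax(scores))]

def binary_mask(X1, X2, freqs_hz, tau_target, tau_interferer):
    """True where steering to the target ITD gives the larger DSB magnitude."""
    return (dsb_magnitude(X1, X2, freqs_hz, tau_target)
            > dsb_magnitude(X1, X2, freqs_hz, tau_interferer))

# --- Synthetic single-frame demo (hypothetical data, not from the paper) ---
rng = np.random.default_rng(0)
freqs = np.arange(1, 257) * 31.25           # 31.25 Hz ... 8 kHz, DC bin skipped
candidates = np.linspace(-1e-3, 1e-3, 201)  # candidate ITDs, 10 microsecond step

# One source with a 0.3 ms ITD: the inter-channel phase wraps above ~1.67 kHz.
true_itd = 3e-4
S = rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
itd_hat = estimate_itd(S, S * np.exp(-2j * np.pi * freqs * true_itd),
                       freqs, candidates)

# Two sources occupying disjoint bins (A in even-indexed bins, B in odd ones).
itd_a, itd_b = 3e-4, -4e-4
from_a = np.arange(freqs.size) % 2 == 0
A = np.where(from_a, S, 0)
B = np.where(~from_a, S, 0)
X1 = A + B
X2 = (A * np.exp(-2j * np.pi * freqs * itd_a)
      + B * np.exp(-2j * np.pi * freqs * itd_b))
mask = binary_mask(X1, X2, freqs, itd_a, itd_b)  # marks the bins owned by A
```

In this toy setting the sources do not overlap in frequency, so the mask can be checked against the known bin ownership. Real speech mixtures overlap, and the ITDs of both sources would first have to be estimated from the mixture itself.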

MeSH terms

  • Humans
  • Speech
  • Speech Perception*