A Bayesian inference model for speech localization (L)

José Escolano; José M Perez-Lorenzo; Ning Xiang; Máximo Cobos; José J López

doi:10.1121/1.4740489

J Acoust Soc Am. 2012 Sep;132(3):1257-60. doi: 10.1121/1.4740489.

Authors

José Escolano¹, José M Perez-Lorenzo, Ning Xiang, Máximo Cobos, José J López

Affiliation

¹ Multimedia and Multimodal Processing Research Group, University of Jaén, 23700, Linares, Spain. escolano@ujaen.es

PMID: 22978853

Abstract

Localizing active speakers with microphone arrays is an active research topic of considerable interest in many areas of acoustics. Many source localization algorithms compute the Generalized Cross-Correlation function between microphone pairs with phase transform weighting. Unfortunately, the performance of these methods degrades severely when wall reflections and multiple sound sources are present in the acoustic environment. As a result, estimating the number of active sound sources and their actual directions becomes a challenging task. To tackle this problem, a Bayesian inference framework is proposed. Using a nested sampling algorithm, a mixture model and its parameters are estimated, giving both the number of sources (model selection) and their angles of arrival (parameter estimation). A set of measured data demonstrates the accuracy of the proposed model.
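
For readers unfamiliar with the Generalized Cross-Correlation with phase transform weighting (GCC-PHAT) mentioned in the abstract, the following is a minimal sketch of the textbook technique for a single microphone pair. It is not the authors' implementation, and it does not cover the Bayesian mixture model or the nested sampling step. The function name `gcc_phat` and the parameters `fs` and `max_tau` are assumptions introduced here for illustration.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, via GCC-PHAT.

    Illustrative sketch only; names and defaults are hypothetical.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum; a positive peak lag means x2 lags x1.
    cross = X2 * np.conj(X1)

    # PHAT weighting: discard magnitude, keep phase only.
    cross /= np.abs(cross) + 1e-12

    cc = np.fft.irfft(cross, n=n)

    # Limit the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


# Usage: a noise signal and a copy delayed by 5 samples.
rng = np.random.default_rng(0)
fs = 16000
x1 = rng.standard_normal(1024)
delay = 5
x2 = np.concatenate((np.zeros(delay), x1[:-delay]))
tau = gcc_phat(x1, x2, fs)
```

Under a far-field assumption, a delay `tau` for a pair with spacing `d` corresponds to an angle of arrival through `tau = d * sin(theta) / c`, where `c` is the speed of sound. In reverberant or multi-source conditions the correlation shows several competing peaks, which is the situation the abstract's mixture-model approach targets.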

MeSH terms

Acoustics / instrumentation*
Algorithms
Bayes Theorem*
Humans
Models, Theoretical*
Motion
Signal Processing, Computer-Assisted*
Signal-To-Noise Ratio
Sound
Speech*
Time Factors
Transducers