Fast grid-free strength mapping of multiple sound sources from microphone array data using a Transformer architecture

J Acoust Soc Am. 2022 Nov;152(5):2543. doi: 10.1121/10.0015005.

Abstract

Conventional microphone array methods for the characterization of sound sources that require a focus grid are, depending on the grid resolution, either computationally demanding or limited in reconstruction accuracy. This paper presents a deep learning method for grid-free source characterization using a Transformer architecture that is trained exclusively with simulated data. Unlike previous grid-free model architectures, the presented approach requires only a single model to characterize an unknown number of ground-truth sources. The model predicts a set of source components, spatially arranged in clusters. Integration over the predicted cluster components allows the strength of each ground-truth source to be determined individually. Fast and accurate source mapping for up to ten sources at different frequencies is demonstrated, and strategies to reduce the training effort at neighboring frequencies are given. A comparison with the established grid-based CLEAN-SC method and a probabilistic sparse Bayesian learning method on experimental data supports the validity of the approach.
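The integration step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `integrate_clusters`, the distance-based grouping rule, the `radius` value, and the example data are all assumptions made for illustration. The sketch assumes that each predicted component has a 2D position and a positive strength on a linear scale (for example squared sound pressure, not decibels), groups components that lie close together, and sums the strengths within each group.

```python
import numpy as np


def integrate_clusters(positions, strengths, radius=0.05):
    """Group nearby components and sum their strengths (illustrative only).

    positions : (N, 2) array of predicted component locations.
    strengths : (N,) array of positive, linear-scale component strengths.
    radius    : hypothetical linking distance; components closer than this
                (directly or through a chain of neighbors) share a cluster.
    Returns a list of (centroid, total_strength) tuples, one per cluster.
    """
    positions = np.asarray(positions, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    n = len(positions)
    labels = -np.ones(n, dtype=int)  # -1 marks an unassigned component
    n_clusters = 0

    for seed in range(n):
        if labels[seed] != -1:
            continue
        # Grow a new cluster from this seed by following close neighbors.
        labels[seed] = n_clusters
        stack = [seed]
        while stack:
            i = stack.pop()
            dist = np.linalg.norm(positions - positions[i], axis=1)
            for j in np.flatnonzero((dist <= radius) & (labels == -1)):
                labels[j] = n_clusters
                stack.append(j)
        n_clusters += 1

    sources = []
    for k in range(n_clusters):
        mask = labels == k
        total = strengths[mask].sum()  # integrated source strength
        # Strength-weighted centroid as the estimated source location.
        centroid = (strengths[mask, None] * positions[mask]).sum(axis=0) / total
        sources.append((centroid, total))
    return sources


# Made-up example: five components forming two clusters.
example_positions = [[0.10, 0.10], [0.11, 0.10], [0.10, 0.11],
                     [-0.20, 0.30], [-0.21, 0.30]]
example_strengths = [0.5, 0.3, 0.2, 0.04, 0.06]
sources = integrate_clusters(example_positions, example_strengths)
for centroid, total in sources:
    print(centroid, total)
```

Summing on a linear scale matters here: levels in decibels would first have to be converted back to linear quantities before integration. The fixed linking radius is the simplest possible grouping rule and would merge sources that are closer together than `radius`.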