End-to-end sound field reproduction based on deep learning

Xi Hong; Bokai Du; Shuang Yang; Menghui Lei; Xiangyang Zeng

doi:10.1121/10.0019575

J Acoust Soc Am. 2023 May 1;153(5):3055. doi: 10.1121/10.0019575.

Authors

Xi Hong¹, Bokai Du², Shuang Yang¹, Menghui Lei¹, Xiangyang Zeng¹

Affiliations

¹ School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, 710072, China.
² Aircraft Strength Research Institute, Xi'an, 710065, China.

PMID: 37219493

Abstract

Sound field reproduction, which aims to create a virtual acoustic environment, is a fundamental technology for realizing virtual reality. In sound field reproduction, the driving signals of the loudspeakers are calculated from the signals collected by the microphones and the working environment of the reproduction system. In this paper, an end-to-end reproduction method based on deep learning is proposed. The inputs of the system are the sound-pressure signals recorded by the microphones, and the outputs are the driving signals of the loudspeakers. A convolutional autoencoder network with skip connections is used in the frequency domain. Furthermore, sparse layers are applied to capture the sparse features of the sound field. Simulation results show that the reproduction errors of the proposed method are lower than those of the conventional pressure matching and least absolute shrinkage and selection operator (LASSO) methods, especially at high frequencies. Experiments were performed with both single and multiple primary sources. The results in both cases demonstrate that the proposed method achieves better high-frequency performance than the conventional methods.
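For orientation, the conventional pressure matching baseline named in the abstract can be sketched as a regularized least-squares problem at a single frequency. This is a minimal illustration only, not the authors' implementation: the free-field Green's function, the circular array geometry, the frequency, the regularization weight, and all function names (`green_function`, `pressure_matching`, `circle`) are assumptions introduced here.

```python
import numpy as np

def green_function(src, rcv, k):
    """Free-field Green's function exp(-jkr) / (4*pi*r), shape (n_rcv, n_src).
    Assumption: anechoic propagation between point sources and receivers."""
    r = np.linalg.norm(rcv[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def pressure_matching(G, p, reg=1e-3):
    """Tikhonov-regularized least squares: d = (G^H G + reg*I)^-1 G^H p."""
    n_spk = G.shape[1]
    A = G.conj().T @ G + reg * np.eye(n_spk)
    return np.linalg.solve(A, G.conj().T @ p)

def circle(n, radius):
    """n points equally spaced on a circle in the z = 0 plane (hypothetical layout)."""
    ang = 2 * np.pi * np.arange(n) / n
    return np.stack([radius * np.cos(ang), radius * np.sin(ang), np.zeros(n)], axis=1)

c = 343.0                       # speed of sound in m/s
f = 500.0                       # illustrative frequency in Hz
k = 2 * np.pi * f / c           # wavenumber

speakers = circle(16, 2.0)      # assumed loudspeaker array
mics = circle(12, 0.5)          # assumed microphone array
primary = np.array([[3.0, 1.0, 0.0]])  # assumed single primary source

G = green_function(speakers, mics, k)          # loudspeaker-to-microphone transfer matrix
p_des = green_function(primary, mics, k)[:, 0] # desired pressure at the microphones

d = pressure_matching(G, p_des, reg=1e-6)      # loudspeaker driving signals
p_rep = G @ d                                  # reproduced pressure at the microphones

# Normalized reproduction error at the microphone positions, in dB
err_db = 10 * np.log10(np.sum(np.abs(p_rep - p_des) ** 2) / np.sum(np.abs(p_des) ** 2))
print(f"reproduction error: {err_db:.1f} dB")
```

In this sketch the error is measured only at the matching points; an evaluation of reproduction quality would normally also sample the field away from the microphones, where such methods tend to degrade as frequency increases.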