Self-supervised acoustic representation learning via acoustic-embedding memory unit modified space autoencoder for underwater target recognition

J Acoust Soc Am. 2022 Nov;152(5):2905. doi: 10.1121/10.0015138.

Abstract

Because high-quality signals obtained from passive sonar are expensive to annotate and a single feature generalizes poorly in the ocean environment, this paper proposes self-supervised acoustic representation learning with an acoustic-embedding memory unit modified space autoencoder (ASAE) and applies it to underwater target recognition. Inspired by the animal auditory system, the first step is to design a self-supervised representation learning method, the space autoencoder (SAE), which merges the Mel filter-bank (FBank), valued for its acoustic discrimination, and the gammatone filter-bank (GBank), valued for its anti-noise robustness, into the SAE spectrogram (SAE Spec). Because SAE Spec carries little high-level semantic information, an acoustic-embedding memory unit (AEMU) is then introduced as an adversarial-enhancement strategy. During the auxiliary task, additional negative samples are incorporated into an improved contrastive loss function to obtain adversarially enhanced features, the ASAE spectrogram (ASAE Spec). Finally, comprehensive comparison and ablation experiments on two underwater datasets show that ASAE Spec improves accuracy by more than 0.96% over other mainstream acoustic features, along with better convergence rate and anti-noise robustness. The results demonstrate the potential value of ASAE in practical applications.
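As a rough illustration of the adversarial-enhancement step, the sketch below shows an InfoNCE-style contrastive loss in which a memory bank of stored embeddings supplies the extra negative samples, in the spirit of the AEMU described above. The function name, tensor shapes, and temperature value are assumptions for illustration only, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss_with_memory(anchor, positive, memory_bank, temperature=0.07):
    """InfoNCE-style contrastive loss with a memory bank of negatives (illustrative sketch).

    anchor, positive: (B, D) embeddings of two views of the same signal.
    memory_bank:      (K, D) previously stored embeddings used as negative samples.
    """
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    negatives = F.normalize(memory_bank, dim=1)

    # One positive similarity score per anchor.
    pos_logits = (anchor * positive).sum(dim=1, keepdim=True)           # (B, 1)
    # Similarities against every stored embedding act as extra negatives.
    neg_logits = anchor @ negatives.t()                                  # (B, K)

    logits = torch.cat([pos_logits, neg_logits], dim=1) / temperature    # (B, 1 + K)
    # The positive pair sits at index 0 of each row.
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)

# Example call with random embeddings (batch of 8, 128-dim, 1024 stored negatives):
# loss = contrastive_loss_with_memory(torch.randn(8, 128), torch.randn(8, 128), torch.randn(1024, 128))
```

Enlarging the memory bank increases the number of negatives per anchor without enlarging the batch, which is one common way such a contrastive objective is strengthened.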