Self-supervised acoustic representation learning via acoustic-embedding memory unit modified space autoencoder for underwater target recognition

J Acoust Soc Am. 2022 Nov;152(5):2905. doi: 10.1121/10.0015138.

Abstract

Because high-quality signals obtained from passive sonar are expensive to annotate and a single feature generalizes poorly in the ocean environment, this paper proposes self-supervised acoustic representation learning with an acoustic-embedding memory unit modified space autoencoder (ASAE) and applies it to underwater target recognition. Inspired by the animal auditory system, the first step is to design a self-supervised representation learning method, the space autoencoder (SAE), which merges the Mel filter-bank (FBank), valued for its acoustic discrimination, and the gammatone filter-bank (GBank), valued for its anti-noise robustness, into the SAE spectrogram (SAE Spec). Because SAE Spec carries little high-level semantic information, an acoustic-embedding memory unit (AEMU) is then introduced as an adversarial-enhancement strategy. During the auxiliary task, additional negative samples are incorporated into an improved contrastive loss function to obtain adversarially enhanced features, the ASAE spectrogram (ASAE Spec). Finally, comprehensive comparison and ablation experiments on two underwater datasets show that ASAE Spec improves accuracy by more than 0.96% over other mainstream acoustic features, along with better convergence rate and anti-noise robustness. The results demonstrate the potential value of ASAE in practical applications.
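As a rough illustration of the adversarial-enhancement step, the sketch below shows an InfoNCE-style contrastive loss in which a memory bank of stored embeddings supplies the extra negative samples, in the spirit of the AEMU described above. The function name, tensor shapes, and temperature value are assumptions for illustration only, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss_with_memory(anchor, positive, memory_bank, temperature=0.07):
    """InfoNCE-style contrastive loss with a memory bank of negatives (illustrative sketch).

    anchor, positive: (B, D) embeddings of two views of the same signal.
    memory_bank:      (K, D) previously stored embeddings used as negative samples.
    """
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    negatives = F.normalize(memory_bank, dim=1)

    # One positive similarity score per anchor.
    pos_logits = (anchor * positive).sum(dim=1, keepdim=True)           # (B, 1)
    # Similarities against every stored embedding act as extra negatives.
    neg_logits = anchor @ negatives.t()                                  # (B, K)

    logits = torch.cat([pos_logits, neg_logits], dim=1) / temperature    # (B, 1 + K)
    # The positive pair sits at index 0 of each row.
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)

# Example call with random embeddings (batch of 8, 128-dim, 1024 stored negatives):
# loss = contrastive_loss_with_memory(torch.randn(8, 128), torch.randn(8, 128), torch.randn(1024, 128))
```

Enlarging the memory bank increases the number of negatives per anchor without enlarging the batch, which is one common way such a contrastive objective is strengthened.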