A Parallel Classification Model for Marine Mammal Sounds Based on Multi-Dimensional Feature Extraction and Data Augmentation

Sensors (Basel). 2022 Sep 30;22(19):7443. doi: 10.3390/s22197443.

Abstract

Due to the poor visibility of the deep-sea environment, acoustic signals are often collected and analyzed to explore the behavior of marine species. With the progress of underwater signal-acquisition technology, the amount of acoustic data obtained from the ocean has exceeded what humans can process manually, so designing efficient marine-mammal classification algorithms has become a research hotspot. In this paper, we design a classification model based on a multi-channel parallel structure, which processes multi-dimensional acoustic features extracted from audio samples and fuses the prediction results of the different channels through a trainable fully connected layer. The model uses transfer learning to achieve faster convergence and introduces data augmentation to improve classification accuracy. The k-fold cross-validation method was used to partition the data set and comprehensively evaluate the prediction accuracy and robustness of the model. The evaluation results showed that the model achieves a mean accuracy of 95.21% with a standard deviation of 0.65%, demonstrating excellent consistency in performance across multiple tests.
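To make the fusion idea concrete, the sketch below shows one way a multi-channel parallel classifier with a trainable fully connected fusion layer could be organized in PyTorch. It is not the authors' implementation: the number of branches, the convolutional backbone of each branch, and the feature-map sizes are placeholder assumptions, and each branch simply stands in for one acoustic feature type (e.g., a spectrogram-like 2-D map).

```python
import torch
import torch.nn as nn


class ParallelFusionClassifier(nn.Module):
    """Illustrative multi-channel parallel classifier (not the published model).

    Each branch processes one 2-D acoustic feature map with its own small CNN;
    the per-branch class scores are concatenated and fused by a trainable
    fully connected layer.
    """

    def __init__(self, num_classes: int, num_branches: int = 3):
        super().__init__()
        self.branches = nn.ModuleList(
            [self._make_branch(num_classes) for _ in range(num_branches)]
        )
        # Trainable fusion layer combining the per-branch predictions.
        self.fusion = nn.Linear(num_branches * num_classes, num_classes)

    @staticmethod
    def _make_branch(num_classes: int) -> nn.Module:
        # Placeholder backbone; depth and widths are assumptions.
        return nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, features: list[torch.Tensor]) -> torch.Tensor:
        # `features` holds one (batch, 1, H, W) tensor per branch;
        # the H x W grids may differ between feature types.
        per_branch = [branch(x) for branch, x in zip(self.branches, features)]
        return self.fusion(torch.cat(per_branch, dim=1))


if __name__ == "__main__":
    model = ParallelFusionClassifier(num_classes=5, num_branches=3)
    dummy = [torch.randn(4, 1, 64, 128) for _ in range(3)]  # 3 feature maps
    print(model(dummy).shape)  # torch.Size([4, 5])
```

In this layout, pretrained backbones could be swapped into each branch for transfer learning, and augmented copies of the audio (before feature extraction) would simply pass through the same branches; both choices are left open here since the abstract does not specify them.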

Keywords: convolutional neural network; data augmentation; feature extraction; marine mammal classification; transfer learning.

MeSH terms

  • Acoustics
  • Algorithms*
  • Humans
  • Neural Networks, Computer*
  • Sound

Grants and funding

This research was partially supported by the Natural Science Foundation of Zhejiang Province (Nos. LZ22F010004 and LZJWY22E090001), the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. GK209907299001-001), the National Natural Science Foundation of China (Nos. 61871163 and 61801431), and the Stable Supporting Fund of Acoustics Science and Technology Laboratory.