Binaural Acoustic Scene Classification Using Wavelet Scattering, Parallel Ensemble Classifiers and Nonlinear Fusion

Sensors (Basel). 2022 Feb 16;22(4):1535. doi: 10.3390/s22041535.

Abstract

The analysis of ambient sounds can be very useful when developing sound-based intelligent systems. Acoustic scene classification (ASC) is the task of identifying, from a set of predefined scenes, the environment in which a sound clip was recorded. ASC has considerable potential for use in urban sound event classification systems. This research presents a hybrid method, including a novel mathematical fusion step, that aims to address the accuracy and adaptability limitations of current state-of-the-art ASC models. The proposed method uses a stereo signal, two ensemble (random subspace) classifiers, and a novel mathematical fusion step. First, a stable, invariant representation of the stereo signal is built using the Wavelet Scattering Transform (WST). A separate random subspace classifier is then trained on the WST features of each mono channel, i.e., the left and right channels. Finally, a novel mathematical fusion formula combines the outputs of the two classifiers, with its parameters found using a genetic algorithm. On the DCASE 2017 dataset, the proposed method achieved higher classification accuracy (about 95%), pushing the boundaries of existing methods.
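
Since the abstract outlines the full pipeline (per-channel wavelet scattering features, one random subspace ensemble per channel, and a parametric nonlinear fusion tuned by a genetic algorithm), a minimal sketch of that pipeline is given below. It is not the authors' implementation: the feature dimensions are placeholders, the random subspace ensembles are approximated with scikit-learn's BaggingClassifier over feature subsets, the power-weighted product fusion is an assumed stand-in (the paper's actual fusion formula is not stated in the abstract), and the genetic algorithm is a deliberately tiny mutation-only toy.

```python
# Illustrative sketch of the abstract's pipeline, not the authors' code.
# Wavelet scattering features are assumed to be precomputed per channel
# (e.g., with a toolbox such as Kymatio); synthetic arrays stand in here.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_clips, n_features, n_scenes = 300, 128, 15            # placeholder sizes
X_left = rng.normal(size=(n_clips, n_features))          # WST features, left channel
X_right = rng.normal(size=(n_clips, n_features))         # WST features, right channel
y = rng.integers(0, n_scenes, size=n_clips)              # scene labels

idx_train, idx_test = train_test_split(np.arange(n_clips),
                                       test_size=0.3, random_state=0)

def random_subspace(X, y, idx):
    """One random-subspace ensemble per channel: bagging over feature subsets."""
    clf = BaggingClassifier(n_estimators=30, max_features=0.5,
                            bootstrap=False, bootstrap_features=True,
                            random_state=0)
    return clf.fit(X[idx], y[idx])

clf_left = random_subspace(X_left, y, idx_train)
clf_right = random_subspace(X_right, y, idx_train)

p_left = clf_left.predict_proba(X_left[idx_test])         # per-channel posteriors
p_right = clf_right.predict_proba(X_right[idx_test])

def fuse(p1, p2, params):
    """Placeholder nonlinear fusion: power-weighted product of posteriors."""
    a, b = params
    p = (p1 ** a) * (p2 ** b)
    return p / p.sum(axis=1, keepdims=True)

def fitness(params):
    """Accuracy of the fused decision; the quantity the GA maximizes.
    (In practice this would be evaluated on a held-out validation split.)"""
    y_hat = fuse(p_left, p_right, params).argmax(axis=1)
    return (y_hat == y[idx_test]).mean()

# Toy genetic algorithm over the two fusion exponents (mutation-only, for brevity).
pop = rng.uniform(0.1, 3.0, size=(20, 2))
for _ in range(30):
    scores = np.array([fitness(p) for p in pop])
    parents = pop[np.argsort(scores)[-10:]]                          # keep the best half
    children = parents + rng.normal(scale=0.1, size=parents.shape)   # mutate
    pop = np.vstack([parents, np.clip(children, 0.05, 5.0)])

best = pop[np.argmax([fitness(p) for p in pop])]
print("fused accuracy:", fitness(best))
```

The sketch only shows the structure of the method: two independently trained per-channel ensembles whose posterior probabilities are combined by a parametric nonlinear formula, with the formula's parameters searched by an evolutionary procedure rather than set by hand.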

Keywords: genetic algorithm; machine learning; sound-based intelligent system; stereo signal; urban sounds classification.

MeSH terms

  • Acoustics*
  • Algorithms
  • Sound
  • Wavelet Analysis*