Separation of overlapping sources in bioacoustic mixtures

J Acoust Soc Am. 2020 Mar;147(3):1688. doi: 10.1121/10.0000932.

Abstract

Source separation is an important step in studying signals that are difficult or impossible to record individually. Common methods such as deep clustering, however, cannot be applied to mixtures with an unknown number of sources and/or to signals that overlap in time and/or frequency, a common problem in bioacoustic recordings. This work presents a supervised learning approach to parse individual sources from the spectrogram of a mixture that contains a variable number of overlapping sources. The method isolates individual sources in the time-frequency domain using only one function but in two separate steps: a detection step that estimates the number of sources and their corresponding bounding boxes, and a segmentation step that extracts a mask for each individual sound. The approach handles the full separation of sources that overlap in both time and frequency using deep neural networks, in a manner applicable to other tasks such as bird audio detection. This paper presents the method and reports on its performance in parsing individual bat signals from recordings containing hundreds of overlapping bat echolocation signals. The method can be extended to other bioacoustic recordings with a variable number of sources and signals that overlap in time and/or frequency.
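
As an illustration only, the final stage of the two-step pipeline described in the abstract (bounding boxes from detection, then per-source masks from segmentation) might be applied to a mixture spectrogram roughly as sketched below. This is not the authors' implementation: the function name `separate_sources`, the `(f0, f1, t0, t1)` box layout, and the soft-mask convention are all assumptions, and the boxes and masks are hand-made stand-ins for what a trained network would predict.

```python
import numpy as np

def separate_sources(spectrogram, boxes, masks):
    """Apply per-source masks inside bounding boxes to a mixture spectrogram.

    spectrogram: 2-D array (frequency bins x time frames) of magnitudes.
    boxes: list of (f0, f1, t0, t1) half-open index ranges, one per source
           (hypothetical layout; stands in for the detection step's output).
    masks: list of 2-D arrays in [0, 1], each shaped (f1 - f0, t1 - t0)
           (stands in for the segmentation step's output).
    Returns one array per source, each shaped like the input spectrogram.
    """
    sources = []
    for (f0, f1, t0, t1), mask in zip(boxes, masks):
        isolated = np.zeros_like(spectrogram)
        # Keep only the energy the mask attributes to this source.
        isolated[f0:f1, t0:t1] = spectrogram[f0:f1, t0:t1] * mask
        sources.append(isolated)
    return sources

# Toy mixture: two synthetic "calls" that overlap in time and frequency.
mixture = np.zeros((6, 8))
mixture[1:4, 0:5] += 1.0   # source A
mixture[2:5, 3:8] += 2.0   # source B

# Hand-made stand-ins for network predictions (assumed, not learned).
boxes = [(1, 4, 0, 5), (2, 5, 3, 8)]
masks = [
    1.0 / mixture[1:4, 0:5],   # share of each bin belonging to source A
    2.0 / mixture[2:5, 3:8],   # share of each bin belonging to source B
]

sources = separate_sources(mixture, boxes, masks)
```

Because the number of sources is simply the number of detected boxes, the same loop handles any source count. In the overlapping bins the soft masks split the mixture energy between the two sources, so the separated spectrograms sum back to the mixture in this toy case.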

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Echolocation*
  • Neural Networks, Computer
  • Sound