PROTAX-Sound: A probabilistic framework for automated animal sound identification

PLoS One. 2017 Sep 1;12(9):e0184048. doi: 10.1371/journal.pone.0184048. eCollection 2017.

Abstract

Autonomous audio recording is stimulating a new field in bioacoustics, with great promise for conducting cost-effective species surveys. One major current challenge is the lack of reliable classifiers capable of multi-species identification. We present PROTAX-Sound, a statistical framework to perform probabilistic classification of animal sounds. PROTAX-Sound is based on a multinomial regression model, and it can use as predictors any kind of sound features or classifications produced by other existing algorithms. PROTAX-Sound combines audio and image processing techniques to scan environmental audio files. It identifies regions of interest (segments of the audio file that contain a vocalization to be classified), extracts acoustic features from them, and compares these features with samples in a reference database. The output of PROTAX-Sound is the probabilistic classification of each vocalization, including the possibility that it represents a species not present in the reference database. We demonstrate the performance of PROTAX-Sound by classifying audio from a species-rich case study of tropical birds. The best performing classifier achieved 68% classification accuracy for 200 bird species. PROTAX-Sound improves the classification power of current techniques by combining information from multiple classifiers in a manner that yields calibrated classification probabilities.
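As an illustration of the general idea (not the authors' implementation), the sketch below shows how a multinomial (softmax) regression can turn per-species predictor values into classification probabilities that include an "unknown species" outcome. The function names, the predictor values, the weights, and the fixed baseline score for the unknown category are all hypothetical and chosen only for this example. In practice the weights would be estimated from training data.

```python
import numpy as np


def softmax(scores):
    """Convert real-valued scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def classify_vocalization(predictors, weights, unknown_score=0.0):
    """Return probabilities for each reference species plus 'unknown'.

    predictors: array of shape (n_species, n_predictors); row i holds the
        predictor values (e.g. similarity scores from different classifiers)
        comparing the query vocalization with reference species i.
    weights: array of shape (n_predictors,); regression coefficients
        (hypothetical here, normally fitted to training data).
    unknown_score: assumed baseline linear score for the outcome
        "species not in the reference database".
    """
    predictors = np.asarray(predictors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    species_scores = predictors @ weights           # one linear score per species
    scores = np.append(species_scores, unknown_score)  # last entry = unknown
    return softmax(scores)


# Hypothetical query: three reference species, two predictors each.
predictors = [
    [0.9, 0.8],  # species A: high similarity under both predictors
    [0.2, 0.3],  # species B
    [0.1, 0.1],  # species C
]
weights = [3.0, 2.0]

probs = classify_vocalization(predictors, weights)
for label, p in zip(["A", "B", "C", "unknown"], probs):
    print(f"{label}: {p:.3f}")
```

Because the unknown category competes with the reference species in the same softmax, a query that resembles none of the references receives most of its probability mass under "unknown" rather than being forced onto the nearest species.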

MeSH terms

  • Acoustics*
  • Algorithms
  • Animals
  • Bayes Theorem
  • Birds / physiology*
  • Probability
  • Regression Analysis
  • Reproducibility of Results
  • Signal Processing, Computer-Assisted*
  • Sound Spectrography
  • Sound*
  • Vocalization, Animal / classification*

Grants and funding

The research was funded by the Academy of Finland (grants 1273253 and 284601 to OO), the Research Council of Norway (CoE grant 223257), and the LUOVA graduate school of the University of Helsinki (PhD grant for UC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.