Capturing the songs of mice with an improved detection and classification method for ultrasonic vocalizations (BootSnap)

PLoS Comput Biol. 2022 May 12;18(5):e1010049. doi: 10.1371/journal.pcbi.1010049. eCollection 2022 May.

Abstract

House mice communicate through ultrasonic vocalizations (USVs), which are above the range of human hearing (>20 kHz), and several automated methods have been developed for USV detection and classification. Here we evaluate their advantages and disadvantages in a full, systematic comparison, while also presenting a new approach. This study aims to 1) determine the most efficient USV detection tool among the existing methods, and 2) develop a classification model that is more generalizable than existing methods. In both cases, we aim to minimize the user intervention required for processing new data. We compared the performance of four detection methods in an out-of-the-box approach: the pretrained DeepSqueak detector, MUPET, USVSEG, and the Automatic Mouse Ultrasound Detector (A-MUD). We also compared these methods to human visual or 'manual' classification (ground truth) after assessing its reliability. A-MUD and USVSEG outperformed the other methods in terms of true positive rates using default and adjusted settings, respectively, and A-MUD outperformed USVSEG when false detection rates were also considered. For automating the classification of USVs, we developed BootSnap, a supervised classification method that combines bootstrapping on Gammatone spectrograms and Convolutional Neural Networks with Snapshot ensemble learning. It successfully classified calls into 12 types, including a new class of false positives that is useful for detection refinement. BootSnap outperformed the pretrained and retrained state-of-the-art tool, and thus it is more generalizable. BootSnap is freely available for scientific use.
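The detection comparison above rests on true positive rates and false detection rates measured against manual annotations. The following is a minimal sketch of how such rates could be computed, not the procedure used in the study. It makes several assumptions: calls are represented as (onset, offset) intervals in seconds, a detection counts as correct if it overlaps any annotated call, and the false detection rate is taken as the fraction of detections that match no annotated call. The function names and the example intervals are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: one call = (onset_s, offset_s) in seconds.
Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two time intervals share any duration."""
    return a[0] < b[1] and b[0] < a[1]


def detection_rates(detected: List[Interval],
                    truth: List[Interval]) -> Tuple[float, float]:
    """Compute (true positive rate, false detection rate).

    Assumed definitions (not taken from the paper):
      true positive rate   = annotated calls overlapped by a detection
                             / all annotated calls
      false detection rate = detections overlapping no annotated call
                             / all detections
    """
    hits = sum(any(overlaps(t, d) for d in detected) for t in truth)
    false_dets = sum(not any(overlaps(d, t) for t in truth) for d in detected)
    tpr = hits / len(truth) if truth else 0.0
    fdr = false_dets / len(detected) if detected else 0.0
    return tpr, fdr


# Made-up illustrative intervals, not data from the study.
truth = [(0.10, 0.15), (0.40, 0.47), (0.90, 0.95), (1.30, 1.36)]
detected = [(0.11, 0.16), (0.41, 0.46), (0.60, 0.62),
            (1.29, 1.35), (1.80, 1.83)]

tpr, fdr = detection_rates(detected, truth)
print(f"true positive rate: {tpr:.2f}, false detection rate: {fdr:.2f}")
```

Reporting both rates together matters because a detector can raise its true positive rate simply by flagging more segments, which is why a method that leads on true positives alone may not lead once false detections are counted.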

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Mice
  • Neural Networks, Computer
  • Reproducibility of Results
  • Vocalization, Animal*

Grants and funding

This work was supported by the FWF P 34922-N project (“NoMASP: Nonsmooth Nonconvex Optimization Methods for Acoustic Signal Processing”) to PB and by a grant (FWF P 28141-B25) of the Austrian Science Foundation (http://www.fwf.ac.at) to DJP and SMZ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.