Learning to localize sounds in a highly reverberant environment: Machine-learning tracking of dolphin whistle-like sounds in a pool

Sean F Woodward; Diana Reiss; Marcelo O Magnasco

doi:10.1371/journal.pone.0235155

PLoS One. 2020 Jun 25;15(6):e0235155. doi: 10.1371/journal.pone.0235155. eCollection 2020.

Authors

Sean F Woodward¹, Diana Reiss², Marcelo O Magnasco¹

Affiliations

¹ Laboratory of Integrative Neuroscience, Center for Studies in Physics and Biology, The Rockefeller University, New York, NY, United States of America.
² Department of Psychology, Hunter College, City University of New York, New York, NY, United States of America.

Abstract

Tracking the origin of propagating wave signals in an environment with complex reflective surfaces is, in its full generality, a nearly intractable problem that has engendered multiple domain-specific literatures. We posit that, if the environment and sensor geometries are fixed, machine learning algorithms can "learn" the acoustical geometry of the environment and accurately track signal origin. In this paper, we propose the first machine-learning-based approach to identifying the source locations of semi-stationary, tonal, dolphin-whistle-like sounds in a highly reverberant space, specifically a half-cylindrical dolphin pool. Our algorithm works by supplying a learning network with an overabundance of location "clues", which are then selected under supervised training for their ability to discriminate source location in this particular environment. More specifically, we deliver estimated time differences of arrival (TDOAs) and normalized cross-correlation values, computed from pairs of hydrophone signals, to a random forest model for high-feature-volume classification and feature selection. We then deliver the selected features to linear discriminant analysis, linear and quadratic Support Vector Machine (SVM), and Gaussian process models. Based on data from 14 sound source locations and 16 hydrophones, our classification models yielded perfect accuracy at predicting novel sound source locations. Our regression models yielded better accuracy than the established Steered-Response Power (SRP) method when all training data were used, and comparable accuracy along the pool surface when deprived of training data at testing sites. Our methods additionally offer improved computation time and the potential for superior localization accuracy in all dimensions with more training data. Because of the generality of our method, we argue it may be useful in a much wider variety of contexts.
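To make the two per-pair features named in the abstract concrete, the sketch below estimates a TDOA and the peak normalized cross-correlation value for every pair of sensors. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sample rate, the lag window, and the synthetic Gaussian-windowed tone (standing in for a whistle-like sound with no reverberation) are all hypothetical, and the abstract does not specify how the cross-correlations were computed.

```python
import math
from itertools import combinations


def normalized_xcorr(x, y, max_lag):
    """Normalized cross-correlation of x and y for lags in [-max_lag, max_lag].

    A positive lag means y is delayed relative to x by that many samples.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    norm = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += xc[i] * yc[j]
        result[lag] = total / norm if norm > 0 else 0.0
    return result


def pair_features(signals, sample_rate, max_lag):
    """Map each sensor pair (a, b) to (TDOA in seconds, peak correlation)."""
    features = {}
    for a, b in combinations(range(len(signals)), 2):
        cc = normalized_xcorr(signals[a], signals[b], max_lag)
        best_lag = max(cc, key=cc.get)
        features[(a, b)] = (best_lag / sample_rate, cc[best_lag])
    return features


def tone_burst(n, delay, center=80.0, sigma=10.0, freq=0.1):
    """Hypothetical test signal: a Gaussian-windowed tone delayed by `delay` samples."""
    out = []
    for i in range(n):
        t = i - delay
        envelope = math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))
        out.append(envelope * math.sin(2.0 * math.pi * freq * t))
    return out


# Assumed values for illustration only: 1 kHz sampling, three sensors,
# arrival delays of 0, 5, and 12 samples.
SAMPLE_RATE = 1000.0
signals = [tone_burst(200, d) for d in (0, 5, 12)]
features = pair_features(signals, SAMPLE_RATE, max_lag=20)
```

In a pipeline like the one the abstract describes, the values in `features` would be flattened into one feature vector per recorded sound and passed to a classifier for selection. In a reverberant pool the correlation peak need not correspond to the direct path, which is the motivation the abstract gives for letting supervised training choose which features are informative.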

Publication types

Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Acoustics*
Animals
Dolphins / physiology*
Machine Learning*
Vocalization, Animal*

Associated data

figshare/10.6084/m9.figshare.7956212

Grants and funding

National Science Foundation (https://www.nsf.gov), Awards 1530544 and 1607280, to MO Magnasco and DR Reiss. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.