Who is calling? Optimizing source identification from marmoset vocalizations with hierarchical machine learning classifiers

J R Soc Interface. 2023 Oct;20(207):20230399. doi: 10.1098/rsif.2023.0399. Epub 2023 Oct 18.

Abstract

With their highly social nature and complex vocal communication system, marmosets are important models for comparative studies of vocal communication and, eventually, language evolution. However, our knowledge about marmoset vocalizations predominantly originates from playback studies or vocal interactions between dyads, and there is a need to move towards studying group-level communication dynamics. Efficient source identification from marmoset vocalizations is essential for this challenge, and machine learning algorithms (MLAs) can aid it. Here we built a pipeline capable of plentiful feature extraction, meaningful feature selection, and supervised classification of vocalizations of up to 18 marmosets. We optimized the classifier by building a hierarchical MLA that first learned to determine the sex of the source, narrowed down the possible source individuals based on their sex and then determined the source identity. We were able to correctly identify the source individual with high precision (87.21%-94.42%, depending on call type, and up to 97.79% after the removal of twins from the dataset). We also examined the robustness of identification across varying sample sizes. Our pipeline is a promising tool not only for source identification from marmoset vocalizations but also for analysing vocalizations of other species.
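The two-stage idea described in the abstract (predict sex first, then choose only among individuals of that sex) can be illustrated with a minimal sketch. The abstract does not state which classifier, features, or data layout the authors used, so everything here is an assumption for illustration: a simple nearest-centroid rule stands in for the actual MLA, and the class names, the `sex_of` mapping, and the two-dimensional toy features are hypothetical.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Stand-in base classifier (an assumption, not the paper's model)."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        # One centroid (per-feature mean) for each label.
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, x):
        # Return the label whose centroid is closest in Euclidean distance.
        return min(self.centroids_, key=lambda lab: dist(x, self.centroids_[lab]))


class HierarchicalCallerClassifier:
    """Stage 1 predicts sex; stage 2 picks a caller among that sex only."""

    def fit(self, X, caller_ids, sex_of):
        sexes = [sex_of[c] for c in caller_ids]
        self.sex_model = NearestCentroid().fit(X, sexes)
        self.id_models = {}
        for sex in set(sexes):
            idx = [i for i, s in enumerate(sexes) if s == sex]
            self.id_models[sex] = NearestCentroid().fit(
                [X[i] for i in idx], [caller_ids[i] for i in idx]
            )
        return self

    def predict_one(self, x):
        sex = self.sex_model.predict_one(x)
        return sex, self.id_models[sex].predict_one(x)


# Hypothetical toy features: feature 0 separates sexes, feature 1 individuals.
X = [[0.0, 0.0], [0.1, 0.1], [0.0, 1.0], [0.1, 1.1],
     [5.0, 0.0], [5.1, 0.1], [5.0, 1.0], [5.1, 1.1]]
callers = ["F1", "F1", "F2", "F2", "M1", "M1", "M2", "M2"]
sex_of = {"F1": "F", "F2": "F", "M1": "M", "M2": "M"}

model = HierarchicalCallerClassifier().fit(X, callers, sex_of)
print(model.predict_one([0.05, 0.9]))  # sex first, then caller within that sex
```

Restricting the second stage to same-sex candidates shrinks the set of competing classes for the identity decision. The trade-off is that an error at the sex stage cannot be recovered at the identity stage.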

Keywords: bioacoustics; hierarchical classifier; machine learning; marmoset calls; source identification; time series analysis.

MeSH terms

  • Animals
  • Callithrix*
  • Deep Learning*
  • Humans
  • Language
  • Machine Learning
  • Vocalization, Animal