Acoustic model adaptation for ortolan bunting (Emberiza hortulana L.) song-type classification

Jidong Tao; Michael T Johnson; Tomasz S Osiejuk

doi:10.1121/1.2837487

J Acoust Soc Am. 2008 Mar;123(3):1582-90. doi: 10.1121/1.2837487.

Authors

Jidong Tao¹, Michael T Johnson, Tomasz S Osiejuk

Affiliation

¹ Speech and Signal Processing Laboratory, Marquette University, PO Box 1881, Milwaukee, Wisconsin 53233-1881, USA. vjdtao@hotmail.com

PMID: 18345846
DOI: 10.1121/1.2837487

Abstract

Automatic systems for vocalization classification often require fairly large amounts of training data. However, collecting and transcribing animal vocalization data is difficult and time-consuming, so large data sets are expensive to create. One natural solution to this problem is the use of acoustic adaptation methods. Such methods, common in human speech recognition systems, create initial models trained on speaker-independent data, then use small amounts of adaptation data to build individual-specific models. Because individual vocal variability is a significant source of variation in bioacoustic data, as it is in human speech, acoustic model adaptation is naturally suited to classification in this domain as well. To demonstrate and evaluate the effectiveness of this approach, this paper presents the application of maximum likelihood linear regression adaptation to ortolan bunting (Emberiza hortulana L.) song-type classification. Classification accuracies for the adapted system are computed as a function of the amount of adaptation data and compared to those of caller-independent and caller-dependent systems. The experimental results indicate that, given the same amount of data, supervised adaptation significantly outperforms both caller-independent and caller-dependent systems.
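
The adaptation step named in the abstract can be illustrated with a minimal sketch of maximum likelihood linear regression (MLLR) for Gaussian means. This is not the authors' implementation. It assumes diagonal-covariance Gaussians, a single global transform shared by all Gaussians, and frame-to-Gaussian posteriors that have already been computed from labelled (supervised) adaptation data. The function names estimate_mllr_mean_transform and adapt_means, and all parameter names, are hypothetical.

```python
import numpy as np


def estimate_mllr_mean_transform(means, variances, frames, gammas):
    """Estimate one global MLLR mean transform W = [b, A] (illustrative sketch).

    means:     (M, d) caller-independent Gaussian means
    variances: (M, d) diagonal variances of those Gaussians
    frames:    (T, d) adaptation feature vectors from one caller
    gammas:    (T, M) posterior of each Gaussian for each frame (assumed given)
    Returns W with shape (d, d + 1); adapted mean = W @ [1, mean].
    """
    n_gauss, dim = means.shape
    # Extended mean vectors xi_m = [1, mu_m]
    xi = np.hstack([np.ones((n_gauss, 1)), means])
    occupancy = gammas.sum(axis=0)   # (M,) total posterior mass per Gaussian
    obs_sum = gammas.T @ frames      # (M, d) posterior-weighted frame sums

    W = np.zeros((dim, dim + 1))
    for i in range(dim):
        inv_var = 1.0 / variances[:, i]
        # Row i of W solves G_i w_i = k_i (closed form for diagonal covariances)
        G = (xi * (occupancy * inv_var)[:, None]).T @ xi
        k = (xi * (obs_sum[:, i] * inv_var)[:, None]).sum(axis=0)
        W[i] = np.linalg.solve(G, k)
    return W


def adapt_means(means, W):
    """Apply the transform to every mean: mu_hat = A @ mu + b."""
    xi = np.hstack([np.ones((len(means), 1)), means])
    return xi @ W.T
```

Because one transform is shared across all Gaussians, only d × (d + 1) parameters are estimated from the adaptation data, which is why this kind of adaptation can work with far less data than training a caller-dependent model from scratch. The sketch requires at least d + 1 Gaussians with nonzero occupancy so that each linear system is solvable.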

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Acoustics*
Adaptation, Physiological* / physiology
Animals
Birds
Models, Biological*
Signal Detection, Psychological
Vocalization, Animal / classification*