The bag-of-frames approach: A not so sufficient model for urban soundscapes

J Acoust Soc Am. 2015 Nov;138(5):EL487-92. doi: 10.1121/1.4935350.

Abstract

The "bag-of-frames" (BOF) approach, which encodes audio signals as the long-term statistical distribution of short-term spectral features, is commonly regarded as an effective and sufficient way to represent environmental sound recordings (soundscapes). The present paper describes a conceptual replication of a seminal article's use of the BOF approach, applied to several other soundscape datasets; the results strongly question the adequacy of the BOF approach for the task. As demonstrated in this paper, the good accuracy originally reported with BOF likely resulted from a particularly permissive dataset with low within-class variability. Soundscape modeling, therefore, may not be the closed case it was once thought to be.
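As an illustration of what "long-term statistical distribution of short-term spectral features" can mean in practice, the following is a minimal sketch and not the pipeline used in the paper or in the seminal article. It assumes log band energies as the short-term features and a single Gaussian (mean and covariance) as the long-term summary; the function names `frame_features` and `summarize`, the frame length, the hop size, and the number of bands are all hypothetical choices made for this example.

```python
import numpy as np

def frame_features(signal, frame_len=1024, hop=512, n_bands=8):
    """Short-term features: log energy in n_bands linearly spaced
    frequency bands, computed per windowed frame (an assumed feature set)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Split the power spectrum into contiguous bands and sum each one.
        bands = np.array_split(power, n_bands)
        feats[i] = [np.log(b.sum() + 1e-12) for b in bands]
    return feats

def summarize(feats):
    """Long-term summary: mean vector and covariance matrix of the frames.
    A single Gaussian is an assumption made here for brevity."""
    return feats.mean(axis=0), np.cov(feats, rowvar=False)

# Synthetic stand-in for a recording (white noise, not real soundscape data).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

feats = frame_features(signal)
mean, cov = summarize(feats)

# Reordering the frames leaves the summary unchanged: the representation
# keeps the distribution of frames and discards their temporal order.
shuffled = feats[rng.permutation(len(feats))]
mean_s, cov_s = summarize(shuffled)
order_invariant = np.allclose(mean, mean_s) and np.allclose(cov, cov_s)
```

The shuffle at the end makes the defining property of such a summary concrete: any information carried by the ordering of frames is lost, which is one reason its sufficiency for soundscapes can be questioned.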

Publication types

  • Research Support, Non-U.S. Gov't