Transfer Learning for Improved Audio-Based Human Activity Recognition

Biosensors (Basel). 2018 Jun 25;8(3):60. doi: 10.3390/bios8030060.

Abstract

Human activities are accompanied by characteristic sound events, the processing of which can provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes that are statistically close to those associated with limited data; (b) learns a multiple-input, multiple-output transformation; and (c) transforms the data of the closest classes so that they can be used for modeling the classes associated with limited data. Furthermore, the proposed framework includes a feature set extracted from signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes.
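The three steps (a) to (c) can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: each class is summarized by a single Gaussian over its feature vectors, "statistically close" is measured with a symmetric Kullback-Leibler divergence, and the learned multiple-input, multiple-output transformation is replaced by a closed-form affine moment-matching map (the keywords suggest the paper relies on an echo state network instead). The function names `gaussian_stats`, `symmetric_kl`, `closest_class`, and `moment_matching_transfer` are hypothetical.

```python
import numpy as np


def gaussian_stats(X, eps=1e-6):
    """Mean and regularized covariance of a feature matrix X (n_frames x n_dims)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    return mu, cov


def symmetric_kl(X, Y):
    """Symmetric KL divergence between Gaussians fitted to X and Y (an assumed closeness measure)."""
    mu_x, cov_x = gaussian_stats(X)
    mu_y, cov_y = gaussian_stats(Y)
    d = X.shape[1]

    def kl(mu0, c0, mu1, c1):
        c1_inv = np.linalg.inv(c1)
        diff = mu1 - mu0
        _, logdet0 = np.linalg.slogdet(c0)
        _, logdet1 = np.linalg.slogdet(c1)
        return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff - d + logdet1 - logdet0)

    return kl(mu_x, cov_x, mu_y, cov_y) + kl(mu_y, cov_y, mu_x, cov_x)


def closest_class(scarce, rich):
    """Step (a): name of the data-rich class closest to the scarce class.

    `rich` maps class names to feature matrices.
    """
    return min(rich, key=lambda name: symmetric_kl(scarce, rich[name]))


def moment_matching_transfer(source, target):
    """Steps (b) and (c): fit an affine map and apply it to the source features.

    Stand-in for a learned transformation: whiten the source with its own
    Cholesky factor, then recolor and shift it to the target's statistics.
    """
    mu_s, cov_s = gaussian_stats(source)
    mu_t, cov_t = gaussian_stats(target)
    L_s = np.linalg.cholesky(cov_s)  # cov_s = L_s @ L_s.T
    L_t = np.linalg.cholesky(cov_t)  # cov_t = L_t @ L_t.T
    M = np.linalg.inv(L_s).T @ L_t.T  # maps row vectors: whiten, then recolor
    return (source - mu_s) @ M + mu_t


if __name__ == "__main__":
    # Synthetic features only; no real audio data is involved.
    rng = np.random.default_rng(0)
    rich = {
        "class_A": rng.normal(0.0, 1.0, size=(400, 3)),
        "class_B": rng.normal(5.0, 1.0, size=(400, 3)),
    }
    scarce = rng.normal(4.8, 1.0, size=(40, 3))

    name = closest_class(scarce, rich)
    augmented = moment_matching_transfer(rich[name], scarce)
    print(name, augmented.shape)
```

The transformed samples could then be pooled with the original scarce-class data before training a generative model for that class. Because the map here only aligns first- and second-order statistics, it is a much cruder transformation than a trained network and serves only to make the data flow concrete.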

Keywords: echo state network; generalized audio recognition; hidden Markov model; multidomain features; transfer learning.

MeSH terms

  • Algorithms
  • Human Activities*
  • Humans
  • Markov Chains
  • Pattern Recognition, Automated
  • Sound*