Recognition of Human Activities Using Depth Maps and the Viewpoint Feature Histogram Descriptor

Sensors (Basel). 2020 May 22;20(10):2940. doi: 10.3390/s20102940.

Abstract

In this paper, we propose a way of using depth maps transformed into 3D point clouds to classify human activities. The activities are described as time sequences of feature vectors based on the Viewpoint Feature Histogram (VFH) descriptor computed using the Point Cloud Library. Recognition is performed by two types of classifiers: (i) a k-nearest neighbors (k-NN) classifier with a Dynamic Time Warping (DTW) distance measure, (ii) bidirectional long short-term memory (BiLSTM) deep learning networks. We discuss reducing the classification time of the k-NN classifier by introducing a two-tier model, and improving BiLSTM-based classification via transfer learning and the fusion of multiple networks by a fuzzy integral. Our classification results obtained on two representative datasets, the University of Texas at Dallas Multimodal Human Action Dataset and the Microsoft Research (MSR) Action 3D Dataset, are comparable to or better than the current state of the art.
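The k-NN-with-DTW scheme mentioned above can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes each activity is a NumPy array of per-frame descriptor vectors (e.g., 308-dimensional VFH histograms) and uses a plain Euclidean frame-to-frame distance inside a standard DTW recurrence, followed by a majority vote over the k nearest training sequences.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of
    feature vectors (e.g., per-frame VFH descriptors)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def knn_classify(query, train_seqs, train_labels, k=1):
    """k-NN over DTW distances with a simple majority vote."""
    dists = [dtw_distance(query, s) for s in train_seqs]
    nearest = np.argsort(dists)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)
```

Because DTW allows non-linear temporal alignment, two executions of the same activity performed at different speeds can still yield a small distance, which is what makes this measure suitable for comparing activity sequences of unequal length.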

Keywords: BiLSTM; VFH descriptor; activity recognition; dynamic time warping; multiple network fusion; point clouds; transfer learning.

MeSH terms

  • Datasets as Topic
  • Deep Learning*
  • Human Activities*
  • Humans
  • Imaging, Three-Dimensional*