A Deep Learning Approach for Human Action Recognition Using Skeletal Information

Adv Exp Med Biol. 2020;1194:105-114. doi: 10.1007/978-3-030-32622-7_9.

Abstract

In this paper we present an approach to human action recognition for activities of daily living (ADLs) that uses a convolutional neural network (CNN). The network is trained on discrete Fourier transform (DFT) images derived from raw sensor readings, i.e., each human action is ultimately described by an image. More specifically, we work with 3D skeletal positions of human joints, which are obtained by processing raw RGB sequences enhanced with depth information. The motion of each joint may be described by a combination of three 1D signals, representing its coordinates in 3D Euclidean space. All such signals from a set of human joints are concatenated to form an image, which is then transformed by the DFT and used for training and evaluation of a CNN. We evaluate our approach on a publicly available, challenging dataset of human actions that may involve one or more body parts simultaneously, and on two sets of actions that resemble common ADLs.
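The pipeline described above (per-joint 1D coordinate signals concatenated into an image, then transformed by the DFT) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the input layout `(num_joints, 3, num_frames)`, the row-wise concatenation order, and the use of a 2D magnitude spectrum are all assumptions for clarity.

```python
import numpy as np

def joints_to_dft_image(joint_sequences):
    """Build a DFT image from skeletal joint trajectories.

    joint_sequences: array of shape (num_joints, 3, num_frames) holding
    the x, y, z trajectories of each joint over time. The layout and
    names here are illustrative assumptions, not the paper's exact setup.
    """
    num_joints, dims, num_frames = joint_sequences.shape
    # Concatenate the three 1D coordinate signals of every joint
    # row-wise into a (num_joints * 3) x num_frames "signal image".
    signal_image = joint_sequences.reshape(num_joints * dims, num_frames)
    # Apply the 2D discrete Fourier transform; the magnitude spectrum
    # then serves as the input image for the CNN.
    return np.abs(np.fft.fft2(signal_image))

# Example: 20 joints tracked over 64 frames yields a 60 x 64 image.
action = np.random.default_rng(0).standard_normal((20, 3, 64))
image = joints_to_dft_image(action)
```

Each action, regardless of its semantic content, is thereby reduced to a fixed-size 2D frequency-domain representation suitable for standard image-based CNN training.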

Keywords: Activities of daily living; Convolutional neural networks; Human action recognition.

MeSH terms

  • Activities of Daily Living*
  • Bone and Bones* / diagnostic imaging
  • Deep Learning*
  • Humans
  • Joints* / diagnostic imaging
  • Neural Networks, Computer
  • Range of Motion, Articular*