Sensor-Based Human Activity Recognition with Spatio-Temporal Deep Learning

Sensors (Basel). 2021 Mar 18;21(6):2141. doi: 10.3390/s21062141.

Abstract

Human activity recognition (HAR) remains a challenging yet crucial problem in computer vision. HAR is primarily intended to be used alongside other technologies, such as the Internet of Things, to assist in healthcare and eldercare. With the development of deep learning, automatic high-level feature extraction has become possible and has been used to improve HAR performance, and deep-learning techniques have been applied to sensor-based HAR in various fields. This study introduces a new methodology that combines convolutional neural networks (CNN) with varying kernel dimensions and bi-directional long short-term memory (BiLSTM) to capture features at multiple resolutions. The novelty of this research lies in the effective selection of the optimal video representation and in the effective extraction of spatial and temporal features from sensor data using traditional CNN and BiLSTM. The wireless sensor data mining (WISDM) and UCI datasets, in which data are collected with accelerometers and gyroscopes, are used to evaluate the proposed methodology. The results indicate that the proposed scheme is efficient in improving HAR: compared with other available methods, it attains higher accuracy, reaching 98.53% on the WISDM dataset and 97.05% on the UCI dataset.
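A minimal sketch of the kind of multi-kernel CNN plus BiLSTM architecture described above, assuming a Keras implementation; the window length, number of sensor channels, kernel sizes, filter counts, and LSTM width are illustrative placeholders, not the authors' published configuration:

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    def build_cnn_bilstm(window_len=128, n_channels=9, n_classes=6,
                         kernel_sizes=(3, 5, 7), n_filters=64, lstm_units=128):
        # Input: one fixed-length window of raw accelerometer/gyroscope readings.
        inputs = layers.Input(shape=(window_len, n_channels))

        # Parallel 1-D convolutions with different kernel sizes capture local
        # patterns at several temporal resolutions (values are assumptions).
        branches = []
        for k in kernel_sizes:
            x = layers.Conv1D(n_filters, k, padding="same", activation="relu")(inputs)
            x = layers.MaxPooling1D(pool_size=2)(x)
            branches.append(x)
        merged = layers.Concatenate()(branches)

        # A bidirectional LSTM models temporal dependencies across the
        # concatenated multi-resolution feature maps.
        x = layers.Bidirectional(layers.LSTM(lstm_units))(merged)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(n_classes, activation="softmax")(x)
        return Model(inputs, outputs)

    model = build_cnn_bilstm()
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])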

Keywords: Bi-directional LSTM; convolutional neural networks; deep learning; human activity recognition; local spatio-temporal features.

MeSH terms

  • Data Mining
  • Deep Learning*
  • Human Activities
  • Humans
  • Memory, Long-Term
  • Neural Networks, Computer