PAR-Net: An Enhanced Dual-Stream CNN-ESN Architecture for Human Physical Activity Recognition

Sensors (Basel). 2024 Mar 16;24(6):1908. doi: 10.3390/s24061908.

Abstract

Physical exercise affects many facets of life, including mental health, social interaction, physical fitness, and illness prevention, among many others. Therefore, several AI-driven techniques have been developed in the literature to recognize human physical activities. However, these techniques fail to adequately learn the temporal and spatial features of the data patterns. Additionally, these techniques cannot fully capture complex activity patterns over different time periods, emphasizing the need for architectures that increase accuracy by learning the spatial and temporal dependencies in the data separately. Therefore, in this work, we develop an attention-enhanced dual-stream network (PAR-Net) for physical activity recognition that extracts spatial and temporal features simultaneously. The PAR-Net integrates convolutional neural networks (CNNs) and echo state networks (ESNs), followed by a self-attention mechanism for optimal feature selection. The dual-stream feature extraction mechanism enables the PAR-Net to learn spatiotemporal dependencies from the raw data, and the self-attention mechanism contributes substantially by focusing attention on the most informative features, thereby improving the recognition of nuanced activity patterns. The PAR-Net was evaluated on two benchmark physical activity recognition datasets and surpassed the baseline methods on both. Additionally, a thorough ablation study was conducted to determine the optimal model configuration for human physical activity recognition.
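The dual-stream design described above can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation: the layer sizes, the single-channel 1-D input, the fixed random convolution kernels, and the scalar softmax attention are all simplifying assumptions made for illustration, standing in for the paper's trained CNN stream, ESN stream, and self-attention fusion.

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_stream(x, kernels):
    # Spatial stream (CNN stand-in): 1-D valid-mode convolutions,
    # ReLU, then global average pooling -> one feature per kernel.
    feats = [np.maximum(np.convolve(x, k, mode="valid"), 0).mean()
             for k in kernels]
    return np.array(feats)

def esn_stream(x, n_res=32, spectral_radius=0.9):
    # Temporal stream (ESN): input and recurrent weights are random
    # and fixed; only the spectral radius is tuned so the reservoir
    # satisfies the echo state property.
    W_in = rng.uniform(-0.5, 0.5, size=n_res)
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    h = np.zeros(n_res)
    for x_t in x:                      # drive the reservoir through time
        h = np.tanh(W_in * x_t + W @ h)
    return h                           # final state = temporal features

def self_attention(f):
    # Toy scalar self-attention: softmax over feature magnitudes
    # reweights the fused feature vector.
    w = np.exp(f - f.max())
    return (w / w.sum()) * f

x = rng.standard_normal(100)           # one sensor channel over 100 steps
spatial = cnn_stream(x, [rng.standard_normal(5) for _ in range(8)])
temporal = esn_stream(x)
fused = np.concatenate([spatial, temporal])   # dual-stream fusion
attended = self_attention(fused)
print(attended.shape)                  # → (40,)
```

In a full model the attended feature vector would feed a trained classification head; here the sketch only shows how the two streams produce complementary spatial and temporal features that are fused and reweighted before classification.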

Keywords: deep learning; echo state networks; machine learning; physical activity recognition; skeleton data.

MeSH terms

  • Exercise
  • Human Activities
  • Humans
  • Machine Learning*
  • Neural Networks, Computer*
  • Recognition, Psychology

Grants and funding

This research was supported by the Ministry of Science and ICT (MSIT), Korea, under the Information Technology Research Center support program (IITP-2023-RS-2023-00156354) supervised by the Institute for Information and Communications Technology Planning and Evaluation (IITP). It was also supported by the IITP grant under the metaverse support program to nurture the best talents (IITP-2023-RS-2023-00254529), funded by the Korean government (MSIT).