3DMesh-GAR: 3D Human Body Mesh-Based Method for Group Activity Recognition

Sensors (Basel). 2022 Feb 14;22(4):1464. doi: 10.3390/s22041464.

Abstract

Group activity recognition is a prime research topic in video understanding and has many practical applications, such as crowd behavior monitoring and video surveillance. To understand a multi-person or group action, a model should not only identify each individual's action in context but also describe the collective activity. Many previous works adopt skeleton-based approaches with graph convolutional networks for group activity recognition. However, these approaches are limited in scalability, robustness, and interoperability. In this paper, we propose 3DMesh-GAR, a novel approach to 3D human body Mesh-based Group Activity Recognition, which relies on a body center heatmap, camera map, and mesh parameter map instead of the complex and noisy 3D skeleton of each person in the input frames. We adopt a 3D mesh creation method that is conceptually simple, single-stage, and bounding-box-free, and that handles highly occluded, multi-person scenes without any additional computational cost. We evaluate 3DMesh-GAR on a standard group activity dataset, the Collective Activity Dataset, and achieve state-of-the-art performance for group activity recognition.
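One way to picture how a body center heatmap, camera map, and mesh parameter map can stand in for per-person bounding boxes is the minimal sketch below. It is illustrative only and is not the authors' implementation: the function names (`find_centers`, `sample_people`), the 3x3 local-maximum rule, the 0.5 threshold, and the toy map sizes are all assumptions. The idea shown is that each peak in the center heatmap marks one person, and that person's camera and mesh parameters are read from the other two maps at the same location.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]            # H x W confidence values
ParamMap = List[List[List[float]]]  # H x W x C parameter vectors


def find_centers(heatmap: Grid, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) cells at or above `threshold` that are strict 3x3 local maxima.

    Assumption: a simple peak rule for illustration; tied plateaus yield no peak.
    """
    h, w = len(heatmap), len(heatmap[0])
    centers = []
    for r in range(h):
        for c in range(w):
            v = heatmap[r][c]
            if v < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                centers.append((r, c))
    return centers


def sample_people(
    heatmap: Grid,
    camera_map: ParamMap,
    mesh_map: ParamMap,
    threshold: float = 0.5,
) -> List[Dict]:
    """Read camera and mesh parameters at each detected body center.

    No bounding boxes are involved: one pass over the maps handles every person.
    """
    return [
        {
            "center": (r, c),
            "confidence": heatmap[r][c],
            "camera": camera_map[r][c],
            "mesh_params": mesh_map[r][c],
        }
        for r, c in find_centers(heatmap, threshold)
    ]


# Toy 4x4 maps with two people (hypothetical values).
heatmap = [[0.1] * 4 for _ in range(4)]
heatmap[0][0] = 0.9
heatmap[2][3] = 0.8
camera_map = [[[float(r), float(c), 1.0] for c in range(4)] for r in range(4)]
mesh_map = [[[float(r * 10 + c)] for c in range(4)] for r in range(4)]

people = sample_people(heatmap, camera_map, mesh_map)
```

In this toy setup, `people` holds one entry per heatmap peak. In a real pipeline the sampled mesh parameters would feed a body model and then an activity classifier, neither of which is sketched here.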

Keywords: 3D human activity recognition; deep learning; feature extraction; human body mesh estimation; video understanding.

MeSH terms

  • Human Activities
  • Human Body
  • Humans
  • Neural Networks, Computer*
  • Skeleton
  • Surgical Mesh*