C-MHAD: Continuous Multimodal Human Action Dataset of Simultaneous Video and Inertial Sensing

Sensors (Basel). 2020 May 20;20(10):2905. doi: 10.3390/s20102905.

Abstract

Existing public domain multi-modal datasets for human action recognition only include actions of interest that have already been segmented from action streams. These datasets cannot be used to study a more realistic action recognition scenario in which actions of interest occur randomly and continuously among actions of non-interest or no actions. Recognizing actions of interest in continuous action streams is more challenging because the start and end times of these actions are not known in advance and must be determined on the fly. Furthermore, no public domain multi-modal dataset exists in which video and inertial data are captured simultaneously for continuous action streams. The main objective of this paper is to describe a dataset that is collected and made publicly available, named Continuous Multimodal Human Action Dataset (C-MHAD), in which video and inertial data streams are captured simultaneously and continuously. This dataset is then used with an example recognition technique, and the results obtained indicate that the fusion of these two sensing modalities increases the F1 scores compared to using each sensing modality individually.
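The abstract reports that fusing the two sensing modalities raises F1 scores over either modality alone. As a minimal illustrative sketch, the snippet below shows one common way such a comparison could be set up: score-level (late) fusion of per-frame class confidences from a video model and an inertial model, followed by a frame-level F1 computation. The fusion weighting and F1 formulation here are generic assumptions for illustration, not the paper's specific method.

```python
# Illustrative sketch only: a generic late-fusion scheme and frame-level F1,
# not the exact technique evaluated on C-MHAD.

def fuse_scores(video_scores, inertial_scores, w=0.5):
    """Weighted average of per-frame confidence scores from the two modalities.

    w is a hypothetical fusion weight; w=0.5 gives an equal-weight average.
    """
    return [w * v + (1 - w) * i for v, i in zip(video_scores, inertial_scores)]


def f1_score(predicted, actual):
    """Frame-level binary F1: label 1 marks frames of an action of interest."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, thresholding the fused scores at 0.5 yields per-frame detections whose F1 can be compared against detections from each single-modality score stream, mirroring the comparison reported in the abstract.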

Keywords: fusion of video and inertial sensing for action recognition; public domain dataset for multi-modal action recognition; recognition in continuous action streams.

MeSH terms

  • Algorithms*
  • Datasets as Topic*
  • Human Activities*
  • Humans
  • Video Recording*