Comparison of a Deep Learning-Based Pose Estimation System to Marker-Based and Kinect Systems in Exergaming for Balance Training

Elise Klæbo Vonstad; Xiaomeng Su; Beatrix Vereijken; Kerstin Bach; Jan Harald Nilsen

doi:10.3390/s20236940

Sensors (Basel). 2020 Dec 4;20(23):6940. doi: 10.3390/s20236940.

Authors

Elise Klæbo Vonstad¹, Xiaomeng Su¹, Beatrix Vereijken², Kerstin Bach¹, Jan Harald Nilsen¹

Affiliations

¹ Department of Computer Science, Norwegian University of Science and Technology, 7034 Trondheim, Norway.
² Department of Neuromedicine and Movement Science, Norwegian University of Science and Technology, 7030 Trondheim, Norway.

Abstract

Using standard digital cameras in combination with deep learning (DL) for pose estimation is promising for the in-home, independent use of exercise games (exergames). However, it remains to be established to what extent such DL-based systems provide satisfactory accuracy on exergame-relevant measures. This study assesses temporal variation (i.e., variability) in body segment lengths when using a deep learning image processing tool (DeepLabCut, DLC) on two-dimensional (2D) video. This variability is then compared with that of a gold-standard, marker-based three-dimensional motion capture system (3DMoCap, Qualisys AB) and a 3D RGB-depth camera system (Kinect V2, Microsoft Inc.). Data were collected simultaneously from all three systems while participants (N = 12) played a custom balance training exergame. The DLC pose estimation model is pre-trained on a large-scale dataset (ImageNet) and optimized with context-specific, pose-annotated images. Wilcoxon's signed-rank test was used to assess the statistical significance of the differences in variability between systems. The results showed that, with regard to variability, the DLC method performs comparably to the Kinect and, in some segments, even to the 3DMoCap gold-standard system. These results are promising for making exergames more accessible and easier to use, thereby increasing their availability for in-home exercise.
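
As an illustration of the measure described in the abstract, the following minimal sketch computes a body segment's length in each frame from two joint trajectories and summarizes its temporal variability. The abstract does not state how variability was quantified, so the use of the standard deviation here is an assumption, and the function names and coordinate values are hypothetical rather than taken from the study.

```python
import math
import statistics


def segment_lengths(proximal, distal):
    """Per-frame Euclidean distance between two joint trajectories.

    Each trajectory is a list of coordinate tuples (2D or 3D), one per frame.
    """
    if len(proximal) != len(distal):
        raise ValueError("trajectories must have the same number of frames")
    return [math.dist(p, d) for p, d in zip(proximal, distal)]


def length_variability(proximal, distal):
    """Sample standard deviation of segment length over time.

    Assumption: the standard deviation stands in for 'variability';
    the study may have used a different summary statistic.
    """
    return statistics.stdev(segment_lengths(proximal, distal))


# Hypothetical 2D pixel coordinates for a knee and an ankle over four frames.
knee = [(100.0, 200.0), (101.0, 200.0), (102.0, 201.0), (103.0, 201.0)]
ankle = [(100.0, 300.0), (101.0, 301.0), (102.0, 300.0), (103.0, 302.0)]

lengths = segment_lengths(knee, ankle)
variability = length_variability(knee, ankle)
```

A rigid segment should keep a constant length, so a lower value indicates more stable tracking. Per-participant variability values obtained this way for two systems could then be compared as paired samples, for example with `scipy.stats.wilcoxon`, which implements the Wilcoxon signed-rank test.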

Keywords: deep learning; exergaming; human movement; image analysis; Kinect; markerless motion capture; motion capture; segment lengths.

MeSH terms

Deep Learning*
Exercise*
Games, Recreational
Humans
Motion
Postural Balance*