Are Existing Monocular Computer Vision-Based 3D Motion Capture Approaches Ready for Deployment? A Methodological Study on the Example of Alpine Skiing

Sensors (Basel). 2019 Oct 6;19(19):4323. doi: 10.3390/s19194323.

Abstract

In this study, we compared a monocular computer vision (MCV)-based approach with the gold standard for collecting kinematic data on ski tracks (i.e., video-based stereophotogrammetry) and assessed its deployment readiness for answering applied research questions in the context of alpine skiing. The investigated MCV-based approach predicted the three-dimensional human pose and ski orientation from the image data of a single camera. The data set used for training and testing the underlying deep nets originated from a field experiment with six competitive alpine skiers. The normalized mean per joint position error of the MCV-based approach was 0.08 ± 0.01 m. Knee flexion showed an accuracy and precision (in parentheses) of 0.4 ± 7.1° (7.2 ± 1.5°) for the outside leg and -0.2 ± 5.0° (6.7 ± 1.1°) for the inside leg. For hip flexion, the corresponding values were -0.4 ± 6.1° (4.4 ± 1.5°) and -0.7 ± 4.7° (3.7 ± 1.0°), respectively. The accuracy and precision of the skiing-related metrics were 0.03 ± 0.01 m (0.01 ± 0.00 m) for relative center of mass position, -0.1 ± 3.8° (3.4 ± 0.9°) for lean angle, 0.01 ± 0.03 m (0.02 ± 0.01 m) for center of mass to outside ankle distance, 0.01 ± 0.05 m (0.03 ± 0.01 m) for fore/aft position, and 0.00 ± 0.01 m² (0.01 ± 0.00 m²) for drag area. Such magnitudes can be considered acceptable for detecting relevant differences in the context of alpine skiing.
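
For orientation, the two kinds of quantities reported above (a mean per joint position error and a joint flexion angle) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `mpjpe` and `flexion_angle_deg`, the array layout, and the convention that 0° means a fully extended joint are assumptions, and the normalization step applied in the study is not reproduced here.

```python
import numpy as np

def mpjpe(pred, ref):
    """Mean Euclidean distance between predicted and reference joints.

    pred, ref: arrays of shape (frames, joints, 3), in metres (assumed layout).
    The study reports a *normalized* variant; that normalization is omitted here.
    """
    return float(np.linalg.norm(pred - ref, axis=-1).mean())

def flexion_angle_deg(proximal, middle, distal):
    """Flexion angle in degrees at `middle` (e.g. hip, knee, ankle for knee flexion).

    Assumed convention: 0 degrees when the three joint centres are collinear
    (fully extended), increasing as the joint bends.
    """
    u = proximal - middle
    v = distal - middle
    cos_inner = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against rounding slightly outside [-1, 1].
    inner = np.degrees(np.arccos(np.clip(cos_inner, -1.0, 1.0)))
    return 180.0 - inner

# Hypothetical data: every predicted joint is offset by a 3-4-0 cm vector.
ref = np.zeros((2, 3, 3))
pred = ref + np.array([0.03, 0.04, 0.0])
error_m = mpjpe(pred, ref)

# Hypothetical leg with the shank perpendicular to the thigh.
hip = np.array([0.0, 0.0, 1.0])
knee = np.array([0.0, 0.0, 0.0])
ankle = np.array([1.0, 0.0, 0.0])
knee_flexion = flexion_angle_deg(hip, knee, ankle)
```

Applying the angle function to both the MCV-based and the reference poses and comparing the per-frame differences would give the kind of paired data from which accuracy and precision values are derived.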

Keywords: alpine ski racing; biomechanics; human pose estimation; markerless tracking; technical validation; video-based 3D kinematics.