Robust Vision-Based Workout Analysis Using Diversified Deep Latent Variable Model

Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul:2020:2155-2158. doi: 10.1109/EMBC44109.2020.9175454.

Abstract

Exercise has various health benefits and has become an integral part of the contemporary lifestyle. However, some workouts are complex and require a trainer to demonstrate their steps, which is why many workout video tutorials are available online. With access to these tutorials, people can learn the workouts independently by imitating the trainer's poses. However, people may injure themselves if they do not perform the workout steps accurately. Previous work therefore proposed providing visual feedback to users by detecting the 2D skeletons of both the trainer and the learner and then using the detected skeletons to estimate pose accuracy. Comparing 2D skeletons may be unreliable because body shapes vary widely, which complicates skeleton alignment and pose accuracy estimation. To address this challenge, we propose to estimate 3D rather than 2D skeletons and then measure the differences between the joint angles of the 3D skeletons. Leveraging recent advancements in deep latent variable models, we are able to estimate 3D skeletons from videos. Furthermore, a positive-definite kernel based on a diversity-encouraging prior is introduced to provide more accurate pose estimation. Experimental results show the superiority of our proposed 3D pose estimation over the state-of-the-art baselines.
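
As an illustration of the joint-angle comparison described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each 3D skeleton is a list of (x, y, z) joint positions and that an angle is measured at a joint between the two bones meeting there. The joint triplets, the function names, and the use of an absolute difference in radians are all hypothetical choices, since the abstract does not specify them.

```python
import math

# Hypothetical (parent, joint, child) index triplets; the joint set actually
# used in the paper is not given in the abstract.
ANGLE_TRIPLETS = [(0, 1, 2), (1, 2, 3)]


def joint_angle(a, b, c):
    """Return the angle in radians at point b between segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("degenerate bone of zero length")
    cos_theta = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def angle_differences(trainer, learner, triplets=ANGLE_TRIPLETS):
    """Absolute per-joint angle differences between two 3D skeletons."""
    diffs = []
    for p, j, c in triplets:
        angle_t = joint_angle(trainer[p], trainer[j], trainer[c])
        angle_l = joint_angle(learner[p], learner[j], learner[c])
        diffs.append(abs(angle_t - angle_l))
    return diffs


# Toy skeletons (made-up coordinates, not data from the paper).
trainer = [(0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)]
# Same pose on a body twice as large: joint angles are unchanged.
scaled_learner = [(0, 2, 0), (0, 0, 0), (2, 0, 0), (2, -2, 0)]
# Same size, but the last joint is held straight instead of bent.
straight_learner = [(0, 1, 0), (0, 0, 0), (1, 0, 0), (2, 0, 0)]

same_pose = angle_differences(trainer, scaled_learner)
wrong_pose = angle_differences(trainer, straight_learner)
```

In this toy setup, the learner with longer limbs but the same pose yields zero angle differences, while the learner with a straightened joint differs only at that joint. This scale invariance is one reason joint angles can be easier to compare across body shapes than raw joint positions.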

Publication types

  • Research Support, Non-U.S. Gov't