Human Motion Enhancement via Tobit Kalman Filter-Assisted Autoencoder

IEEE Access. 2022:10:29233-29251. doi: 10.1109/access.2022.3157605. Epub 2022 Mar 8.

Abstract

We present a novel approach to enhance the quality of human motion data captured by low-cost depth sensors, referred to as D-Mocap, which suffers from low accuracy and poor stability due to occlusion, interference, and algorithmic limitations. Our approach takes advantage of a large set of high-quality and diverse Mocap data by learning a general motion manifold with a convolutional autoencoder. In addition, the Tobit Kalman filter (TKF) is used to capture the kinematics of each body joint and to handle censored measurement distributions. The TKF is integrated with the autoencoder via latent space optimization, which maintains adherence to the motion manifold while preserving the kinematic nature of the original motion data. Furthermore, because no open-source benchmark dataset exists for this research, we have developed an extension of the Berkeley Multimodal Human Action Database (MHAD) by generating D-Mocap data from RGB-D images. The newly extended MHAD dataset is skeleton-matched and time-synced to the corresponding Mocap data and is publicly available. The proposed algorithm is thoroughly evaluated under different settings on this dataset, on simulated D-Mocap data generated from the CMU Mocap dataset, and on our self-collected D-Mocap dataset. Experimental results show that our approach can improve the accuracy of joint positions and angles as well as skeletal bone lengths by over 50%.

Keywords: Autoencoder; Tobit Kalman filter; depth sensors; human motion manifold; motion capture.
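
As an illustration of what "censored measurement" means in the Tobit setting mentioned in the abstract, the following is a minimal sketch of the standard left-censored (Type I Tobit) measurement model, in which a sensor reports y = max(x, tau) for a latent value x ~ N(mu, sigma^2). The abstract does not give the TKF equations used in the paper, so this is only the textbook expectation of such a measurement, not the authors' filter; the function names and the threshold tau are hypothetical.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_mean(mu: float, sigma: float, tau: float) -> float:
    """Expected value of y = max(x, tau) for x ~ N(mu, sigma^2).

    Assumption: left-censoring at a known threshold tau (hypothetical
    here; the abstract does not specify the censoring limits).
    """
    b = (mu - tau) / sigma
    p_uncensored = normal_cdf(b)  # probability that x exceeds tau
    # E[y] = P(x > tau) * E[x | x > tau] + P(x <= tau) * tau
    return p_uncensored * mu + sigma * normal_pdf(b) + (1.0 - p_uncensored) * tau


if __name__ == "__main__":
    # Near the threshold, the censored mean is biased away from mu.
    print(censored_mean(mu=0.0, sigma=1.0, tau=0.0))
    # Far above the threshold, censoring has almost no effect.
    print(censored_mean(mu=10.0, sigma=1.0, tau=0.0))
```

A filter that ignores censoring would predict a measurement mean of mu in both cases; accounting for the censored distribution corrects the first case, which is the general motivation for Tobit-style measurement updates.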