Markerless 3D Skeleton Tracking Algorithm by Merging Multiple Inaccurate Skeleton Data from Multiple RGB-D Sensors

Sensors (Basel). 2022 Apr 20;22(9):3155. doi: 10.3390/s22093155.

Abstract

Skeleton data, which is often used in the HCI field, is a data structure that can efficiently express human poses and gestures because it consists of 3D positions of joints. The advancement of RGB-D sensors, such as Kinect sensors, enabled the easy capture of skeleton data from depth or RGB images. However, when tracking a target with a single sensor, there is an occlusion problem causing the quality of invisible joints to be randomly degraded. As a result, multiple sensors should be used to reliably track a target in all directions over a wide range. In this paper, we proposed a new method for combining multiple inaccurate skeleton data sets obtained from multiple sensors that capture a target from different angles into a single accurate skeleton data. The proposed algorithm uses density-based spatial clustering of applications with noise (DBSCAN) to prevent noise-added inaccurate joint candidates from participating in the merging process. After merging with the inlier candidates, we used Kalman filter to denoise the tremble error of the joint's movement. We evaluated the proposed algorithm's performance using the best view as the ground truth. In addition, the results of different sizes for the DBSCAN searching area were analyzed. By applying the proposed algorithm, the joint position accuracy of the merged skeleton improved as the number of sensors increased. Furthermore, highest performance was shown when the searching area of DBSCAN was 10 cm.

Keywords: motion capture; multiple RGB-D sensors; sensor fusion; skeleton tracking.

MeSH terms

  • Algorithms*
  • Humans
  • Movement
  • Musculoskeletal System*
  • Skeleton