A Convolutional Neural-Network-Based Training Model to Estimate Actual Distance of Persons in Continuous Images

Sensors (Basel). 2022 Aug 1;22(15):5743. doi: 10.3390/s22155743.

Abstract

Distance and depth detection play a crucial role in intelligent robotics. They enable drones to understand their working environment and immediately avoid collisions and accidents, and they are important in many AI applications. Image-based distance detection usually relies on the correctness of geometric information. However, geometric features are lost when the object is rotated or the camera lens distorts the image. This study proposes a training model based on a convolutional neural network that uses a single-lens camera to estimate the distance of persons in continuous images. The lost depth information can be partially restored using the camera's built-in parameters, without additional correction. The normalized skeleton feature unit vectors have characteristics similar to time-series data and can be classified effectively with a one-dimensional convolutional neural network. Our results show that, for images with occluded legs, accuracy exceeds 90% at 2 to 3 m, reaches 80% to 90% at 4 m, and is about 70% at 5 to 6 m.
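To make the described pipeline concrete, the sketch below shows one way to turn 2D skeleton keypoints (e.g., from OpenPose) into normalized feature unit vectors and classify them into distance classes with a 1D convolutional network. This is a minimal illustration assuming PyTorch; the joint count (25, as in OpenPose BODY_25), layer sizes, choice of root joint, and the five distance classes (2 to 6 m) are assumptions for the example, not the authors' exact architecture.

```python
# Minimal sketch: normalized skeleton unit vectors classified by a 1D CNN.
# Joint count, layer sizes, and the five distance classes are illustrative
# assumptions, not the paper's exact model.
import torch
import torch.nn as nn


def normalize_skeleton(keypoints: torch.Tensor) -> torch.Tensor:
    """Convert raw (x, y) keypoints into unit vectors relative to a root joint.

    keypoints: (batch, num_joints, 2) pixel coordinates from a pose estimator.
    Returns:   (batch, 2, num_joints) channel-first tensor of unit vectors.
    """
    root = keypoints[:, :1, :]                         # assumed root joint (e.g., neck)
    vec = keypoints - root                             # joint offsets from the root
    norm = vec.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    unit = vec / norm                                  # scale-invariant unit vectors
    return unit.permute(0, 2, 1)                       # (batch, channels=2, length=joints)


class DistanceClassifier1D(nn.Module):
    """1D CNN over the joint dimension; outputs one logit per distance class."""

    def __init__(self, num_joints: int = 25, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(2, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                   # pool over the joint axis
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, keypoints: torch.Tensor) -> torch.Tensor:
        x = normalize_skeleton(keypoints)
        x = self.features(x).squeeze(-1)               # (batch, 64)
        return self.classifier(x)                      # logits per distance class


if __name__ == "__main__":
    model = DistanceClassifier1D()
    dummy = torch.rand(8, 25, 2) * 640                 # fake pixel keypoints
    print(model(dummy).shape)                          # torch.Size([8, 5])
```

Because each skeleton is reduced to a fixed-length sequence of unit vectors, the 1D convolutions treat the joint dimension much like a time axis, which is the analogy to time-series data drawn in the abstract.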

Keywords: OpenPose; UAV application; deep learning; human skeletons; occluded human images; rotation.

MeSH terms

  • Humans
  • Neural Networks, Computer*
  • Robotics*