WPO-Net: Windowed Pose Optimization Network for Monocular Visual Odometry Estimation

Sensors (Basel). 2021 Dec 6;21(23):8155. doi: 10.3390/s21238155.

Abstract

Visual odometry is the process of incrementally estimating the location of a camera in 3-dimensional space, for example for autonomous driving. Recent learning-based methods do not require camera calibration and are robust to external noise. In this work, a new method that does not require camera calibration, called the "windowed pose optimization network", is proposed to estimate the 6-degrees-of-freedom pose of a monocular camera. The architecture of the proposed network follows supervised learning-based methods, with a feature encoder and a pose regressor; at each training step it takes multiple consecutive stacks of two grayscale images and enforces composite pose constraints. The KITTI dataset is used to evaluate the performance of the proposed method. The proposed method yielded a rotational error of 3.12 deg/100 m, with a training time of 41.32 ms and an inference time of 7.87 ms. Experiments demonstrate that the proposed method performs competitively with other state-of-the-art related works, which shows the novelty of the proposed technique.
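
The composite pose constraint mentioned above can be illustrated with a small numerical sketch. The abstract does not give the network's actual loss, so everything here is an assumption made for illustration: poses are written as 4x4 homogeneous transforms, the relative pose from frame 1 to frame 3 is expected to equal the product of the poses for frames 1 to 2 and 2 to 3, and the hypothetical helpers `pose_to_matrix` and `composite_residual` measure the gap with a Frobenius norm.

```python
import numpy as np


def pose_to_matrix(rotation_vector, translation):
    """Build a 4x4 homogeneous transform from an axis-angle vector and a translation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rotation_vector = np.asarray(rotation_vector, dtype=float)
    angle = np.linalg.norm(rotation_vector)
    transform = np.eye(4)
    if angle > 1e-12:
        axis = rotation_vector / angle
        # Skew-symmetric cross-product matrix of the rotation axis.
        k = np.array([[0.0, -axis[2], axis[1]],
                      [axis[2], 0.0, -axis[0]],
                      [-axis[1], axis[0], 0.0]])
        # Rodrigues' rotation formula.
        transform[:3, :3] = (np.eye(3) + np.sin(angle) * k
                             + (1.0 - np.cos(angle)) * (k @ k))
    transform[:3, 3] = translation
    return transform


def composite_residual(t_12, t_23, t_13):
    """Frobenius norm of the gap between the chained pose and the direct pose.

    Assumed form of a composite constraint: t_12 @ t_23 should equal t_13.
    """
    return np.linalg.norm(t_12 @ t_23 - t_13)


# Two consecutive relative poses: a slight yaw and roughly 1 m of forward motion each.
t_12 = pose_to_matrix([0.0, 0.02, 0.0], [0.0, 0.0, 1.0])
t_23 = pose_to_matrix([0.0, 0.03, 0.0], [0.0, 0.0, 1.1])

# A direct estimate that agrees with the chain, and one that has drifted.
t_13_consistent = t_12 @ t_23
t_13_drifted = pose_to_matrix([0.0, 0.05, 0.0], [0.0, 0.0, 2.5])

residual_consistent = composite_residual(t_12, t_23, t_13_consistent)
residual_drifted = composite_residual(t_12, t_23, t_13_drifted)
print(residual_consistent, residual_drifted)
```

In a training setting, a residual of this kind would typically be added as a penalty term so that pose estimates over a window of frames stay mutually consistent; how the proposed network weights or formulates it is not stated in the abstract.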

Keywords: deep learning; pose estimation; pose optimization; visual odometry.

MeSH terms

  • Automobile Driving*
  • Calibration