Estimation of 6D Object Pose Using a 2D Bounding Box

Yong Hong; Jin Liu; Zahid Jahangir; Sheng He; Qing Zhang

doi:10.3390/s21092939

Sensors (Basel). 2021 Apr 22;21(9):2939. doi: 10.3390/s21092939.

Authors

Yong Hong¹, Jin Liu¹, Zahid Jahangir¹, Sheng He¹, Qing Zhang²

Affiliations

¹ State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China.
² College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China.

Abstract

This paper presents an efficient method for estimating the 6-dimensional (6D) pose of an object from a single RGB image. A quaternion is used to represent the object's 3D orientation; however, q and -q represent the same orientation, yet the L2 loss between them is very large. We therefore define a new quaternion pose loss function that resolves this ambiguity and, based on it, design a new convolutional neural network, Q-Net, to estimate an object's pose. Because the output quaternion must be a unit vector, Q-Net includes a normalization layer that constrains the pose output to the unit sphere in four-dimensional space. We also propose a new algorithm, the Bounding Box Equation, which recovers the 3D translation quickly and effectively from 2D bounding boxes. Together, these provide a new way of estimating the 3D rotation (R) and 3D translation (t) from only one RGB image, and the method can upgrade any traditional 2D-box prediction algorithm to a 3D prediction model. We evaluated the model on the LineMod dataset; the experiments show that the method is accurate and efficient in terms of L2 loss and computational time.
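
The q / -q ambiguity described in the abstract can be illustrated with a short NumPy sketch. The abstract does not give the formula of the proposed loss, so the function below is only one common sign-invariant form (the smaller of the squared distances to q and to -q), not necessarily the loss used in Q-Net. The names normalize_quaternion and sign_invariant_quaternion_loss are hypothetical and chosen for illustration.

```python
import numpy as np


def normalize_quaternion(q, eps=1e-12):
    """Project a 4-vector onto the unit sphere, as a normalization layer would."""
    q = np.asarray(q, dtype=float)
    return q / max(np.linalg.norm(q), eps)


def sign_invariant_quaternion_loss(q_pred, q_true):
    """Assumed loss form: squared L2 distance to q_true or -q_true, whichever is smaller.

    This is an illustrative choice, not the formula from the paper.
    """
    q_pred = normalize_quaternion(q_pred)
    q_true = normalize_quaternion(q_true)
    d_pos = np.sum((q_pred - q_true) ** 2)
    d_neg = np.sum((q_pred + q_true) ** 2)
    return float(min(d_pos, d_neg))


# q and -q encode the same rotation.
q = np.array([0.5, 0.5, 0.5, 0.5])

# Plain squared L2 distance treats them as far apart: ||q - (-q)||^2 = 4 for unit q.
plain_l2 = float(np.sum((q - (-q)) ** 2))

# The sign-invariant form gives zero for the same pair.
invariant = sign_invariant_quaternion_loss(q, -q)
```

Here plain_l2 evaluates to 4.0, the largest squared distance possible between two unit quaternions, while invariant evaluates to 0.0, which matches the fact that both quaternions describe the same orientation.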

Keywords: 6D pose estimation; Bounding Box Equation; LineMod; quaternion.

Grants and funding

41271454/National Natural Science Foundation of China