Graph Convolutional Network for 3D Object Pose Estimation in a Point Cloud

Tae-Won Jung; Chi-Seo Jeong; In-Seon Kim; Min-Su Yu; Soon-Chul Kwon; Kye-Dong Jung

doi:10.3390/s22218166

Sensors (Basel). 2022 Oct 25;22(21):8166. doi: 10.3390/s22218166.

Authors

Tae-Won Jung¹, Chi-Seo Jeong², In-Seon Kim², Min-Su Yu², Soon-Chul Kwon², Kye-Dong Jung³

Affiliations

¹ Department of Immersive Content Convergence, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea.
² Department of Smart Convergence, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea.
³ Ingenium College of Liberal Arts, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea.

Abstract

Graph Neural Networks (GNNs) are neural networks that learn representations of nodes and of the edges that connect them to other nodes while preserving the graph structure. Graph Convolutional Networks (GCNs), a representative class of GNNs, are used in computer vision to apply the convolution operations of conventional Convolutional Neural Networks (CNNs) to graph-structured data. This paper proposes a one-stage GCN approach for 3D object detection and pose estimation that structures the non-linearly distributed points of a point cloud as a graph. By spatially structuring the input data into graphs, our network provides the details required to analyze, generate, and estimate bounding boxes. We propose a keypoint attention mechanism that aggregates the relative features between points to estimate the category and pose of the object to which each graph vertex belongs, and we design the network for multi-object pose estimation with nine degrees of freedom. In addition, to avoid gimbal lock in 3D space, we represent rotation with quaternions instead of Euler angles. Experimental results showed that memory usage and efficiency could be improved by aggregating the features of each point and its neighbors in a graph structure. Overall, the system achieved performance comparable to that of state-of-the-art systems.
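The quaternion-based pose mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the nine degrees of freedom split into three for the box center, three for the box size, and three for rotation, with the rotation stored as a quaternion in (w, x, y, z) order; the abstract states neither the split nor the ordering. The function names `quat_normalize`, `quat_to_matrix`, and `box_corners` are hypothetical.

```python
import math


def quat_normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    if n == 0.0:
        raise ValueError("a zero quaternion does not encode a rotation")
    return (w / n, x / n, y / n, z / n)


def quat_to_matrix(q):
    """Convert a quaternion to a 3x3 rotation matrix (row-major lists).

    No intermediate Euler angles are used, so the conversion has no
    gimbal-lock singularity.
    """
    w, x, y, z = quat_normalize(q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def box_corners(center, size, quat):
    """Return the 8 corners of an oriented box.

    Assumed parameterization: center (3 values), size (3 values), and a
    quaternion whose rotation contributes the remaining 3 degrees of freedom.
    """
    rot = quat_to_matrix(quat)
    half = [s / 2.0 for s in size]
    corners = []
    for sx in (-1, 1):
        for sy in (-1, 1):
            for sz in (-1, 1):
                # Corner in the box's local frame, then rotate and translate.
                local = (sx * half[0], sy * half[1], sz * half[2])
                world = tuple(
                    center[i] + sum(rot[i][j] * local[j] for j in range(3))
                    for i in range(3)
                )
                corners.append(world)
    return corners
```

For example, `quat_to_matrix((math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)))` yields a 90-degree rotation about the z-axis. A network that regresses four quaternion components would typically normalize them before use, which is why the conversion normalizes its input.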

Keywords: graph convolutional network; graph neural network; one-stage detection method; three-dimensional object detection; three-dimensional object pose estimation; three-dimensional point cloud.

MeSH terms

Computer Graphics*
Imaging, Three-Dimensional*
Neural Networks, Computer*

Grants and funding

This research received no external funding.