Progressive Deep Learning Framework for Recognizing 3D Orientations and Object Class Based on Point Cloud Representation

Sensors (Basel). 2021 Sep 12;21(18):6108. doi: 10.3390/s21186108.

Abstract

Deep learning approaches to estimating the full 3D orientation of an object, in addition to its class, are limited in accuracy because it is difficult to learn the continuous nature of three-axis orientation variations by regression or classification with sufficient generalization. This paper presents a novel progressive deep learning framework, herein referred to as 3D POCO Net, that estimates orientations about three rotational axes with high accuracy while keeping network complexity low. The proposed 3D POCO Net is configured using four PointNet-based networks that independently represent the object class and the three individual axes of rotation. The four independent networks are linked by in-between association subnetworks, which are trained to progressively map the global features learned by the individual networks, one after another, in order to fine-tune the independent networks. In 3D POCO Net, high accuracy is achieved by combining a high-precision classification over a large number of orientation classes with a regression based on a weighted sum of the classification outputs. High efficiency is maintained by the progressive framework, in which the large number of orientation classes are grouped into independent networks linked by association subnetworks. We implemented 3D POCO Net for full three-axis orientation variations and trained it with about 146 million orientation variations augmented from the ModelNet10 dataset. The test results show an orientation regression error of about 2.5° and an object classification accuracy of about 90% for general three-axis orientation estimation. Furthermore, we demonstrate that a pre-trained 3D POCO Net can serve as an orientation representation platform on which the orientations and object classes of partial point clouds from occluded objects are learned by transfer learning.
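
The idea of a regression based on a weighted sum of classification outputs can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform bin layout, the bin width, and the use of a softmax to obtain the weights are all assumptions made for illustration, and the sketch ignores the wrap-around of angles at 360°.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def angle_from_bins(logits, bin_width_deg):
    """Estimate a continuous angle about one axis (hypothetical helper).

    Assumes orientation class i covers [i * w, (i + 1) * w) degrees and
    returns the probability-weighted sum of the bin centers.
    """
    probs = softmax(logits)
    centers = [(i + 0.5) * bin_width_deg for i in range(len(logits))]
    return sum(p * c for p, c in zip(probs, centers))

# Four 10-degree bins; the scores are split evenly between bins 1 and 2,
# so the estimate falls on the boundary between them.
logits = [0.0, 2.0, 2.0, 0.0]
estimate = angle_from_bins(logits, 10.0)
print(round(estimate, 6))  # 20.0
```

Under these assumptions, the estimate can fall between bin centers, so the angular resolution is not bounded by the bin width as it would be when only the most probable class is reported.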

Keywords: 3D object; 3D point cloud; association network; orientation representation; progressive learning.

MeSH terms

  • Deep Learning*