Single-Image 3-D Reconstruction: Rethinking Point Cloud Deformation

IEEE Trans Neural Netw Learn Syst. 2022 Nov 14:PP. doi: 10.1109/TNNLS.2022.3211929. Online ahead of print.

Abstract

Single-image 3-D reconstruction has long been a challenging problem. Deep learning approaches have recently been applied to it, but their ability to generate point clouds remains limited by inefficient and expensive 3-D representations, the coupling between the output size and the number of model parameters, or the lack of a suitable computational operation. In this article, we present a novel deep-learning-based method to reconstruct a point cloud of an object from a single still image. The proposed method decomposes into two steps: feature fusion and deformation. The first step extracts both global and point-specific shape features from a 2-D object image and injects them into a randomly generated point cloud. In the second step, deformation, we introduce a new layer, termed GraphX, that models the interrelationship between points as common graph convolutions do but operates on unordered sets. The framework is applicable to realistic images with background, as we optionally learn a mask branch that segments objects from input images. To improve the quality of the generated point clouds, we further propose an objective function that controls point uniformity. In addition, we introduce several variants of GraphX that span the trade-off from best performance to smallest memory budget. Moreover, the proposed model can generate a point cloud of arbitrary size, making it the first deep learning method able to do so. Extensive experiments demonstrate that our method outperforms existing models and sets a new state of the art on several performance metrics for single-image 3-D reconstruction.
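The abstract does not spell out GraphX's exact formulation, but one plausible reading is a graph-convolution-like layer whose "adjacency" is learned densely over the point set rather than derived from an explicit graph, followed by a shared per-point feature transform. The PyTorch sketch below is a minimal illustration under that assumption; the class name GraphXLayer, the dense mixing matrix, and all tensor shapes are illustrative choices, not the authors' implementation.

import torch
import torch.nn as nn

class GraphXLayer(nn.Module):
    """Graph-convolution-like layer over a point set (illustrative sketch).

    Instead of an explicit adjacency graph, it learns a dense mixing matrix
    over point indices, so every point can exchange information with every
    other point. This design is an assumption made for illustration.
    """

    def __init__(self, num_points, in_features, out_features, activation=None):
        super().__init__()
        # Learned dense "adjacency" over the point set, initialized to the
        # identity so the layer starts as a pure per-point transform.
        self.mixing = nn.Parameter(torch.eye(num_points))
        # Shared feature transform applied identically to every point.
        self.transform = nn.Linear(in_features, out_features)
        self.activation = activation  # e.g., nn.ReLU(); None for a final layer

    def forward(self, points):
        # points: (batch, num_points, in_features)
        mixed = torch.einsum("ij,bjf->bif", self.mixing, points)
        out = self.transform(mixed)
        return out if self.activation is None else self.activation(out)

# Hypothetical usage: map a fused cloud (xyz coordinates plus injected image
# features, per the abstract's feature-fusion step) to 3-D coordinates.
fused = torch.randn(2, 1024, 3 + 256)   # batch of 2 clouds, 1024 points each
layer = GraphXLayer(1024, 3 + 256, 3)   # final layer: no activation
coords = layer(fused)                   # -> (2, 1024, 3)

Note that a dense learned mixing matrix fixes the point count at construction time and costs O(N^2) parameters in the number of points, which is consistent with the abstract's mention of GraphX variants trading peak performance against memory budget.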