JIMR: Joint Semantic and Geometry Learning for Point Scene Instance Mesh Reconstruction

IEEE Trans Vis Comput Graph. 2024 May 9:PP. doi: 10.1109/TVCG.2024.3398737. Online ahead of print.

Abstract

Point scene instance mesh reconstruction is a challenging task because it requires both scene-level instance segmentation and instance-level mesh reconstruction from partial observations simultaneously. Previous works adopt either a detection backbone or a segmentation backbone, and then directly employ a mesh reconstruction network to produce complete meshes from incomplete instance point clouds. To further boost the mesh reconstruction quality with both local details and global smoothness, in this work we propose JIMR, a joint framework with two cascaded stages for semantic and geometry understanding. In the first stage, we propose to perform instance segmentation and object detection simultaneously. By making the two tasks promote each other, this design facilitates subsequent mesh reconstruction by providing more precisely segmented instance points and better alignment, which benefits from the predicted complete bounding boxes. In the second stage, we propose a complete-then-reconstruct procedure, where the completion module explicitly disentangles completion from reconstruction and enables the use of pre-trained weights of existing powerful completion and reconstruction networks. Moreover, we propose a comprehensive confidence score to filter proposals, which considers the quality of instance segmentation, bounding box detection, semantic classification, and mesh reconstruction at the same time. Experiments show that our proposed JIMR outperforms state-of-the-art methods in instance reconstruction, both qualitatively and quantitatively.
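
The proposal filtering step described above could be sketched roughly as follows. The abstract does not state how the four quality terms are combined, so the geometric mean used here, the threshold value, and all names (`Proposal`, `combined_confidence`, `filter_proposals`, and the four score fields) are illustrative assumptions rather than the authors' formulation.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Proposal:
    """Hypothetical container for one instance proposal; scores lie in [0, 1]."""
    name: str
    seg_score: float   # quality of the instance segmentation
    box_score: float   # quality of the bounding box detection
    cls_score: float   # semantic classification confidence
    mesh_score: float  # quality of the reconstructed mesh


def combined_confidence(p: Proposal) -> float:
    # Assumption: combine the four terms with a geometric mean, so that a
    # single weak term pulls the overall score down. The paper may use a
    # different combination.
    terms = (p.seg_score, p.box_score, p.cls_score, p.mesh_score)
    return prod(terms) ** (1.0 / len(terms))


def filter_proposals(proposals, threshold=0.5):
    # Keep only proposals whose combined score reaches the (assumed) threshold.
    return [p for p in proposals if combined_confidence(p) >= threshold]


proposals = [
    Proposal("chair", 0.9, 0.9, 0.9, 0.9),
    Proposal("table", 0.9, 0.8, 0.9, 0.05),  # well segmented, poorly reconstructed
]
kept = filter_proposals(proposals)
```

In this toy setup, the second proposal is rejected even though three of its four scores are high, which illustrates why a score that considers all four qualities at once can filter differently from a segmentation or detection score alone.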