Full Transformer Framework for Robust Point Cloud Registration With Deep Information Interaction

Guangyan Chen; Meiling Wang; Qingxiang Zhang; Li Yuan; Yufeng Yue

doi:10.1109/TNNLS.2023.3267333

IEEE Trans Neural Netw Learn Syst. 2023 May 10:PP. doi: 10.1109/TNNLS.2023.3267333. Online ahead of print.

PMID: 37163402

Abstract

Point cloud registration is an essential technology in computer vision and robotics. Recently, transformer-based methods have achieved strong performance in point cloud registration by exploiting the transformer's order invariance and its ability to model dependencies when aggregating information. However, they still suffer from indistinct feature extraction and from sensitivity to noise and outliers, owing to three major limitations: 1) the adoption of convolutional neural networks (CNNs), whose local receptive fields fail to model global relations, leaves the extracted features susceptible to noise; 2) the shallow-wide architecture of the transformers and the lack of positional information make information interaction inefficient, which leads to indistinct feature extraction; and 3) insufficient consideration of geometric compatibility leads to ambiguous identification of incorrect correspondences. To address these limitations, a novel full transformer network for point cloud registration is proposed, named the deep interaction transformer (DIT), which incorporates: 1) a point cloud structure extractor (PSE) that retrieves structural information and models global relations with a local feature integrator (LFI) and transformer encoders; 2) a deep-narrow point feature transformer (PFT) that facilitates deep information interaction across a pair of point clouds with positional information, so that the transformers establish comprehensive associations and directly learn the relative positions between points; and 3) a geometric matching-based correspondence confidence evaluation (GMCCE) method that measures spatial consistency and estimates correspondence confidence using a purpose-designed triangulated descriptor. Extensive experiments on the ModelNet40, ScanObjectNN, and 3DMatch datasets demonstrate that the method aligns point clouds precisely and achieves superior performance compared with state-of-the-art methods.
The code is publicly available at https://github.com/CGuangyan-BIT/DIT.
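
The spatial consistency idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the GMCCE method or its triangulated descriptor, whose details the abstract does not give; it is a generic pairwise length-consistency check that relies only on the fact that rigid transforms preserve distances. The function name `spatial_consistency_scores` and the tolerance `tol` are hypothetical, chosen for illustration.

```python
import numpy as np


def spatial_consistency_scores(src, tgt, tol=0.05):
    """Score putative correspondences by pairwise length consistency.

    Illustrative stand-in only (NOT the paper's GMCCE): row i of `src`
    is assumed to be matched to row i of `tgt`; both are (N, 3) arrays.
    A rigid transform preserves distances, so for two correct matches
    i and j, ||src_i - src_j|| should equal ||tgt_i - tgt_j||.
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)

    # Pairwise Euclidean distances inside each cloud, shape (N, N).
    d_src = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    d_tgt = np.linalg.norm(tgt[:, None, :] - tgt[None, :, :], axis=-1)

    # Two correspondences are compatible if the lengths agree within tol.
    compatible = np.abs(d_src - d_tgt) < tol
    np.fill_diagonal(compatible, False)  # ignore self-comparison

    # Confidence: fraction of the other correspondences each one agrees with.
    return compatible.sum(axis=1) / max(len(src) - 1, 1)


# Small demo: 17 correct matches under a rigid motion, 3 corrupted ones.
rng = np.random.default_rng(0)
src = rng.random((20, 3))
angle = np.deg2rad(30.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
tgt = src @ rot_z.T + np.array([0.5, -0.2, 1.0])
tgt[17:] += np.array([[5.0, 0.0, 0.0], [0.0, 7.0, 0.0], [0.0, 0.0, 9.0]])

scores = spatial_consistency_scores(src, tgt)
print(scores.round(2))
```

In this demo the correctly matched points agree with one another and receive high scores, while the three corrupted matches receive low ones; a learned confidence estimator would replace the fixed threshold used here.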