APP-Net: Auxiliary-Point-Based Push and Pull Operations for Efficient Point Cloud Recognition

Tao Lu; Chunxu Liu; Youxin Chen; Gangshan Wu; Limin Wang

doi:10.1109/TIP.2023.3333191

IEEE Trans Image Process. 2023:32:6500-6513. doi: 10.1109/TIP.2023.3333191. Epub 2023 Dec 1.

PMID: 37988214

Abstract

Aggregating neighbor features is essential for point cloud neural networks. In existing work, each point in the cloud is inevitably selected as a neighbor of multiple aggregation centers, because every center gathers neighbor features from the whole point cloud independently. Each point therefore participates in the calculation repeatedly, generating redundant duplicates in memory and leading to intensive computation and memory consumption. Meanwhile, to pursue higher accuracy, previous methods often rely on a complex local aggregator to extract fine geometric representations, further slowing down the processing pipeline. To address these issues, we propose a new local aggregator of linear complexity for point cloud analysis, termed APP. Specifically, we introduce an auxiliary container as an anchor to exchange features between the source point and the aggregating center. Each source point pushes its feature to only one auxiliary container, and each center point pulls features from only one auxiliary container. This avoids the re-computation of each source point. To facilitate learning the local structure of the point cloud, we use an online normal estimation module that provides explainable geometric information to enhance APP's modeling capability. The resulting network is more efficient than all previous baselines by a clear margin while also consuming less memory. Experiments on classification and semantic segmentation demonstrate that APP-Net reaches accuracies comparable to those of other networks. In the classification task, it can process more than 10,000 samples per second with less than 10 GB of memory on a single GPU. We will release the code at https://github.com/MCG-NJU/APP-Net.
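
The push and pull steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how auxiliary containers are chosen or how pushed features are reduced, so the uniform grid cells, the mean reduction, the use of every point as its own center, and the names `push_pull` and `cell` are all assumptions made for illustration. What the sketch does preserve is that each point is touched once in the push and once in the pull, so the aggregation itself is linear in the number of points.

```python
import numpy as np

def push_pull(points, feats, cell=0.5):
    """Toy push/pull aggregation: each point uses exactly one container."""
    points = np.asarray(points, dtype=np.float64)
    feats = np.asarray(feats, dtype=np.float64)

    # Assumption: containers are cells of a uniform grid over the points.
    keys = np.floor(points / cell).astype(np.int64)
    _, container_id = np.unique(keys, axis=0, return_inverse=True)
    container_id = container_id.reshape(-1)
    n_containers = int(container_id.max()) + 1

    # Push: every source point adds its feature to its single container.
    pooled = np.zeros((n_containers, feats.shape[1]))
    np.add.at(pooled, container_id, feats)

    # Assumption: the container stores the mean of the pushed features.
    counts = np.bincount(container_id, minlength=n_containers)[:, None]
    pooled /= counts

    # Pull: every center (here, every point) reads from its single container.
    return pooled[container_id]

# Two points share a grid cell, the third sits alone in another cell.
pts = [[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.9, 0.9, 0.9]]
fts = [[1.0, 0.0], [3.0, 2.0], [10.0, 10.0]]
aggregated = push_pull(pts, fts)
```

In this toy input the first two points receive the same averaged feature, while the third keeps its own, and no point's feature is copied once per neighboring center as in a conventional per-center neighbor search.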