Generative Variational-Contrastive Learning for Self-Supervised Point Cloud Representation

IEEE Trans Pattern Anal Mach Intell. 2024 Mar 19:PP. doi: 10.1109/TPAMI.2024.3378708. Online ahead of print.

Abstract

Self-supervised representation learning for 3D point clouds has attracted increasing attention. However, existing methods in 3D computer vision generally represent latent features with fixed embeddings and impose hard constraints that force the latent feature values of positive samples to coincide, which limits the ability of feature extractors to generalize across data domains. To address this issue, we propose a Generative Variational-Contrastive Learning (GVC) model, in which a Gaussian distribution is used to construct a continuous, smooth representation of the latent features. A distribution constraint and cross-supervision are constructed to improve the transferability of the feature extractor between synthetic and real-world data. Specifically, we design a variational contrastive module that constrains the feature distribution of each sample in the latent space rather than its feature values. Moreover, a generative cross-supervision module is introduced to preserve invariant features and promote consistency of the feature distributions among positive samples. Experimental results demonstrate that GVC achieves state-of-the-art (SOTA) performance on different downstream tasks. In particular, when pre-trained only on the synthetic dataset and transferred to the real-world dataset, GVC leads by 8.4% in linear classification and 14.2% in few-shot classification.
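
The idea of constraining feature distributions rather than feature values can be illustrated with a small sketch. The abstract does not specify the form of the distribution constraint, so the following is only an assumption: each view of a point cloud is encoded as a diagonal Gaussian (mean and log-variance), a latent is drawn with the standard reparameterization trick, and two positive views are compared with a symmetric KL divergence. The function names and the choice of symmetric KL are hypothetical and are not taken from the paper.

```python
import math
import random


def kl_diag_gaussian(mu_p, logvar_p, mu_q, logvar_q):
    """KL(p || q) between two diagonal Gaussians given means and log-variances."""
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, logvar_p, mu_q, logvar_q):
        total += 0.5 * (lq - lp + (math.exp(lp) + (mp - mq) ** 2) / math.exp(lq) - 1.0)
    return total


def symmetric_kl(mu_a, logvar_a, mu_b, logvar_b):
    """Average of KL(a || b) and KL(b || a); zero only when the two Gaussians match."""
    forward = kl_diag_gaussian(mu_a, logvar_a, mu_b, logvar_b)
    backward = kl_diag_gaussian(mu_b, logvar_b, mu_a, logvar_a)
    return 0.5 * (forward + backward)


def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


# Hypothetical encoder outputs for two augmented views of the same point cloud.
mu_view1, logvar_view1 = [0.2, -0.1, 0.5], [0.0, -0.5, 0.1]
mu_view2, logvar_view2 = [0.25, -0.05, 0.45], [0.1, -0.4, 0.0]

rng = random.Random(0)
z_view1 = sample_latent(mu_view1, logvar_view1, rng)
positive_loss = symmetric_kl(mu_view1, logvar_view1, mu_view2, logvar_view2)
```

In this sketch, minimizing `positive_loss` pulls the two distributions together without forcing any single sampled latent such as `z_view1` to match the other view exactly, which is the contrast with a hard constraint on embedding values.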