CACNN: Capsule Attention Convolutional Neural Networks for 3D Object Recognition

IEEE Trans Neural Netw Learn Syst. 2023 Nov 7:PP. doi: 10.1109/TNNLS.2023.3326606. Online ahead of print.

Abstract

Recently, view-based approaches, which recognize a 3D object through its projected 2D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate view-wise features, which usually leads to a loss of visual information. To tackle this problem, we propose a novel layer, the capsule attention layer (CAL), which uses an attention mechanism to fuse the features expressed by capsules. Specifically, instead of the dynamic routing algorithm, we use an attention module to transmit information from lower-level capsules to higher-level capsules, which noticeably improves the speed of capsule networks. In particular, the view pooling layer of the multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights take certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.
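
The abstract does not give the CAL equations, so the following is only an illustrative sketch and not the authors' layer. It shows, for a generic attention-weighted fusion over views, how a pooling operation can appear as a special case of the attention weights. The function `attention_fuse` and its single `scale` parameter are hypothetical stand-ins for trainable weights: with `scale = 0` the softmax weights are uniform and the fusion reduces to average pooling, while a very large `scale` pushes the weights toward the largest entry in each channel, approximating an element-wise maximum across views.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(view_features, scale):
    """Fuse per-view feature vectors with channel-wise attention over views.

    view_features: list of V feature vectors, each of length D.
    scale: hypothetical scalar weight (an assumption for this sketch);
           the attention logit for view v and channel d is
           scale * view_features[v][d].
    """
    num_dims = len(view_features[0])
    fused = []
    for d in range(num_dims):
        # Collect channel d from every view.
        column = [view[d] for view in view_features]
        # Attention weights over the views for this channel.
        weights = softmax([scale * x for x in column])
        # Weighted sum of the channel values across views.
        fused.append(sum(w * x for w, x in zip(weights, column)))
    return fused


# Three toy "views", each described by a 3-dimensional feature vector.
views = [
    [0.2, 1.0, 0.5],
    [0.9, 0.1, 0.4],
    [0.4, 0.3, 0.8],
]

mean_like = attention_fuse(views, scale=0.0)    # uniform weights: average pooling
max_like = attention_fuse(views, scale=1000.0)  # sharp weights: close to max pooling
```

In a trained network the attention logits would come from learned parameters rather than a fixed scalar; the point of the sketch is only that fixed pooling rules sit at particular weight settings of a more general attention-based aggregation.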