Large-Scale Image Retrieval with Deep Attentive Global Features

Int J Neural Syst. 2023 Mar;33(3):2350013. doi: 10.1142/S0129065723500132. Epub 2023 Feb 25.

Abstract

Obtaining discriminative features is a core problem in image retrieval. Many recent works extract features with convolutional neural networks (CNNs). However, clutter and occlusion reduce the distinguishability of CNN-extracted features. To address this problem, we use attention mechanisms to emphasize high-response activations in the feature map. We propose two attention modules: a spatial attention module and a channel attention module. For the spatial attention module, we first capture global information and model the relations between channels to form a region evaluator, which evaluates local features and assigns them new weights. For the channel attention module, we use a vector of trainable parameters to weight the importance of each feature map. The two attention modules are cascaded to adjust the weight distribution of the feature map, which makes the extracted features more discriminative. Furthermore, we present a scale and mask scheme that scales the major components and filters out meaningless local features. This scheme mitigates the effect of the varying scales of the major components in images by applying multiple scale filters, and it filters out redundant features with the MAX-Mask. Extensive experiments demonstrate that the two attention modules are complementary, and that our network with all three modules outperforms state-of-the-art methods on four well-known image retrieval datasets.
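The cascade of the two attention modules can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract does not give the exact formulation. The following are all assumptions made for illustration: the region evaluator is approximated by a dot product between a global average-pooled descriptor and each local feature, both modules use a sigmoid gate, spatial attention is applied before channel attention, and the function names (`spatial_attention`, `channel_attention`, `attend`) are hypothetical. The scale and mask scheme is omitted because the abstract does not specify it.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(fmap):
    """Reweight each spatial location of a C x H x W feature map.

    Assumption: the "region evaluator" is a global average-pooled
    channel descriptor; a location's score is its dot product with it.
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    # Global information: one average value per channel.
    g = [sum(sum(row) for row in ch) / (H * W) for ch in fmap]
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            # Score the local feature (the C-vector at location i, j).
            score = sum(g[c] * fmap[c][i][j] for c in range(C))
            w = sigmoid(score)
            for c in range(C):
                out[c][i][j] = w * fmap[c][i][j]
    return out


def channel_attention(fmap, params):
    """Scale each channel by a gate derived from one trainable parameter.

    Assumption: the gate is sigmoid(param); training is not shown.
    """
    return [
        [[sigmoid(p) * v for v in row] for row in ch]
        for ch, p in zip(fmap, params)
    ]


def attend(fmap, params):
    # Assumed cascade order: spatial attention, then channel attention.
    return channel_attention(spatial_attention(fmap), params)


# Toy input: 2 channels on a 2 x 2 grid, channel parameters set to zero.
fmap = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
out = attend(fmap, [0.0, 0.0])
```

In this toy input, the location with the larger response receives the larger spatial weight, while positions with zero activation remain zero. In a real network the channel parameters would be learned jointly with the backbone rather than fixed.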

Keywords: Image retrieval; attention mechanism; convolutional neural network.

MeSH terms

  • Image Processing, Computer-Assisted*
  • Neural Networks, Computer*