An Efficient Sampling-Based Attention Network for Semantic Segmentation

IEEE Trans Image Process. 2022;31:2850-2863. doi: 10.1109/TIP.2022.3162101. Epub 2022 Apr 5.

Abstract

Self-attention is widely explored to model long-range dependencies in semantic segmentation. However, this operation computes pair-wise relationships between the query point and all other points, leading to prohibitive complexity. In this paper, we propose an efficient Sampling-based Attention Network which combines a novel sampling method with an attention mechanism for semantic segmentation. Specifically, we design a Stochastic Sampling-based Attention Module (SSAM) to capture the relationships between the query point and a stochastically sampled representative subset from a global perspective, where the sampled subset is selected by a Stochastic Sampling Module. Compared to self-attention, our SSAM achieves comparable segmentation performance while significantly reducing computational redundancy. In addition, motivated by the observation that not all pixels benefit from global contextual information, we design a Deterministic Sampling-based Attention Module (DSAM) that samples features from a local region to obtain detailed information. Extensive experiments demonstrate that our proposed method competes with or performs favorably against state-of-the-art methods on the Cityscapes, ADE20K, COCO Stuff, and PASCAL Context datasets.
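As a rough illustration of the idea (not the authors' implementation), the sketch below contrasts full self-attention, which scores every query against all n keys, with a sampling-based variant that scores queries only against a randomly chosen subset of k keys/values, reducing the cost from O(n^2 d) to O(n k d). The function name `sampled_attention` and the parameter `num_samples` are hypothetical names introduced here for clarity:

```python
import numpy as np

def sampled_attention(x, num_samples, seed=None):
    """Attention between every query and a random subset of keys/values.

    x: (n, d) feature matrix (one vector per spatial position).
    num_samples: size k of the stochastically sampled key/value subset.
    Cost is O(n * k * d) rather than the O(n^2 * d) of full self-attention.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Stochastic sampling: pick k representative positions uniformly at random.
    idx = rng.choice(n, size=num_samples, replace=False)
    keys = x[idx]    # (k, d) sampled keys
    values = x[idx]  # (k, d) sampled values
    # Scaled dot-product scores between all n queries and the k sampled keys.
    scores = x @ keys.T / np.sqrt(d)               # (n, k)
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over sampled keys
    return weights @ values                        # (n, d) aggregated context
```

In practice the subset would be selected by a learned sampling module and the queries/keys/values would come from learned projections; the uniform sampling above only illustrates the complexity reduction.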