Hyperspectral Image Classification With Multi-Attention Transformer and Adaptive Superpixel Segmentation-Based Active Learning

IEEE Trans Image Process. 2023:32:3606-3621. doi: 10.1109/TIP.2023.3287738. Epub 2023 Jul 3.

Abstract

Deep learning (DL) based methods, represented by convolutional neural networks (CNNs), are widely used in hyperspectral image classification (HSIC). Some of these methods extract local information well but are less efficient at extracting long-range features, while others show the opposite behavior. For example, limited by their receptive fields, CNNs have difficulty capturing contextual spectral-spatial features from long-range spectral-spatial relationships. In addition, the success of DL-based methods depends heavily on large numbers of labeled samples, whose acquisition is time-consuming and costly. To address these problems, a hyperspectral classification framework based on a multi-attention Transformer (MAT) and adaptive superpixel segmentation-based active learning (MAT-ASSAL) is proposed, which achieves excellent classification performance, especially with small sample sizes. First, a multi-attention Transformer network is built for HSIC. Specifically, the self-attention module of the Transformer is applied to model long-range contextual dependencies between spectral-spatial embeddings. Moreover, to capture local features, an outlook-attention module, which can efficiently encode fine-level features and contexts into tokens, is used to improve the correlation between the center spectral-spatial embedding and its surroundings. Second, to train an excellent MAT model with limited labeled samples, a novel active learning (AL) method based on superpixel segmentation is proposed to select important samples for MAT. Finally, to better integrate local spatial similarity into active learning, an adaptive superpixel (SP) segmentation algorithm, which can save SPs in uninformative regions and preserve edge details in complex regions, is employed to generate better local spatial constraints for AL. Quantitative and qualitative results indicate that MAT-ASSAL outperforms seven state-of-the-art methods on three HSI datasets.
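
The self-attention step mentioned in the abstract can be illustrated with a minimal sketch of generic single-head scaled dot-product attention over the pixels of one image patch. This is not the MAT architecture itself: the patch size (5x5), embedding width (16), projection width (8), random weights, and the function names `softmax` and `self_attention` are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Generic single-head scaled dot-product self-attention.

    tokens: (n_tokens, d_in) array, one row per spectral-spatial embedding.
    w_q, w_k, w_v: (d_in, d_head) projection matrices (hypothetical weights).
    Returns the attended tokens and the (n_tokens, n_tokens) attention map.
    """
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    # Every token is scored against every other token, so the dependency
    # range is not limited by a convolutional receptive field.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
# Hypothetical 5x5 spatial patch with 16 embedded spectral channels per pixel.
patch = rng.normal(size=(5, 5, 16))
tokens = patch.reshape(-1, 16)  # 25 tokens, one per pixel

d_head = 8  # assumed projection width
w_q, w_k, w_v = (rng.normal(size=(16, d_head)) * 0.1 for _ in range(3))

attended, attn_map = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `attn_map` sums to one and weights all 25 pixels of the patch, which is the sense in which self-attention models long-range dependencies; the outlook-attention module described above would complement this by focusing on each token's immediate neighborhood.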