Ensemble model with cascade attention mechanism for high-resolution remote sensing image scene classification

Opt Express. 2020 Jul 20;28(15):22358-22387. doi: 10.1364/OE.395866.

Abstract

Scene classification of high-resolution remote sensing images is a fundamental task of earth observation. And numerous methods have been proposed to achieve this. However, these models are inadequate as the number of labelled training data limits them. Most of the existing methods entirely rely on global information, while regions with class-specific ground objects determine the categories of high-resolution remote sensing images. An ensemble model with a cascade attention mechanism, which consists of two kinds of the convolutional neural network, is proposed to address these issues. To improve the generality of the feature extractor, each branch is trained on different large datasets to enrich the prior knowledge. Moreover, to force the model to focus on the most class-specific region in each high-resolution remote sensing image, a cascade attention mechanism is proposed to combine the branches and capture the most discriminative information. By experiments on four benchmark datasets, OPTIMAL-31, UC Merced Land-Use Dataset, Aerial Image Dataset and NWPU-RESISC45, the proposed end-to-end model cascade attention-based double branches model in this paper achieves state-of-the-art performance on each benchmark dataset.