Multiscale Feature-Learning with a Unified Model for Hyperspectral Image Classification

Sensors (Basel). 2023 Sep 3;23(17):7628. doi: 10.3390/s23177628.

Abstract

In the realm of hyperspectral image classification, the pursuit of higher accuracy and more comprehensive feature extraction has led to the formulation of an advanced architectural paradigm. This study proposes a unified model that synergistically leverages the capabilities of three distinct branches: a Swin Transformer, a convolutional neural network, and an encoder-decoder. The main objective is to facilitate multiscale feature learning, a pivotal facet of hyperspectral image classification, with each branch specializing in a unique aspect of multiscale feature extraction. The Swin Transformer, recognized for its competence in capturing long-range dependencies, extracts structural features across different scales; simultaneously, the convolutional neural network performs localized feature extraction, preserving fine-grained spatial information. The encoder-decoder branch carries out comprehensive analysis and reconstruction, integrating both multiscale spectral and spatial information. To evaluate our approach, we conducted experiments on publicly available datasets and compared the results with state-of-the-art methods. Our proposed model achieves the best classification results among the compared methods, with overall accuracies of 96.87%, 98.48%, and 98.62% on the Xuzhou, Salinas, and LK datasets, respectively.
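The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of one plausible reading of the three-branch design. Everything in it is an assumption for illustration: the layer widths, the 8x8 patch size and 200-band input, concatenation-based fusion, the auxiliary reconstruction output, and the use of a plain TransformerEncoder as a stand-in for the Swin Transformer branch. It is not the authors' configuration.

```python
# Hypothetical sketch of a three-branch unified model for HSI classification.
# All hyperparameters and the fusion strategy are assumptions, not the paper's.
import torch
import torch.nn as nn


class ThreeBranchHSINet(nn.Module):
    def __init__(self, bands: int = 200, n_classes: int = 16, dim: int = 64):
        super().__init__()
        # Branch 1: transformer over per-pixel spectral tokens. A plain
        # TransformerEncoder stands in for the Swin branch (global context).
        self.token_embed = nn.Linear(bands, dim)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, dim_feedforward=2 * dim, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Branch 2: CNN for localized spatial feature extraction.
        self.cnn = nn.Sequential(
            nn.Conv2d(bands, dim, kernel_size=3, padding=1),
            nn.BatchNorm2d(dim), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True), nn.AdaptiveAvgPool2d(1),
        )
        # Branch 3: encoder-decoder; the bottleneck feeds classification and
        # the decoder reconstructs the input patch as an auxiliary output.
        self.encoder = nn.Sequential(
            nn.Conv2d(bands, dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True), nn.MaxPool2d(2),
        )
        self.decoder = nn.ConvTranspose2d(dim, bands, kernel_size=2, stride=2)
        # Fuse the three dim-wide feature vectors by concatenation (assumed).
        self.classifier = nn.Linear(3 * dim, n_classes)

    def forward(self, x: torch.Tensor):
        # x: (B, bands, H, W) hyperspectral patch, H and W assumed even.
        tokens = x.flatten(2).transpose(1, 2)              # (B, H*W, bands)
        t = self.transformer(self.token_embed(tokens)).mean(dim=1)  # (B, dim)
        s = self.cnn(x).flatten(1)                         # (B, dim)
        z = self.encoder(x)                                # (B, dim, H/2, W/2)
        e = z.mean(dim=(2, 3))                             # (B, dim)
        recon = self.decoder(z)                            # (B, bands, H, W)
        logits = self.classifier(torch.cat([t, s, e], dim=1))
        return logits, recon


# Quick shape check on random data.
model = ThreeBranchHSINet()
x = torch.randn(4, 200, 8, 8)
logits, recon = model(x)
print(logits.shape, recon.shape)  # torch.Size([4, 16]) torch.Size([4, 200, 8, 8])
```

In a sketch like this, the classification loss on `logits` could be combined with a reconstruction loss on `recon`, so the encoder-decoder branch learns spectral-spatial structure beyond what the labels alone provide; whether the paper uses such an auxiliary objective is not stated in the abstract.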

Keywords: convolutional neural network; deep learning models; feature extraction; hyperspectral image classification; multiscale features; Swin Transformer.

Grants and funding

This work was funded by the Researchers Supporting Project Number (RSP2023R102), King Saud University, Riyadh, Saudi Arabia.