GRPAFusion: A Gradient Residual and Pyramid Attention-Based Multiscale Network for Multimodal Image Fusion

Jinxin Wang; Xiaoli Xi; Dongmei Li; Fang Li; Guanxin Zhang

doi:10.3390/e25010169

Entropy (Basel). 2023 Jan 14;25(1):169. doi: 10.3390/e25010169.

Authors

Jinxin Wang¹ ², Xiaoli Xi¹ ², Dongmei Li¹ ², Fang Li¹ ², Guanxin Zhang¹

Affiliations

¹ Optoelectronic System Laboratory, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China.
² College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 100049, China.

Abstract

Multimodal image fusion aims to retain valid information from different modalities, remove redundant information to highlight critical targets, and maintain rich texture details in the fused image. However, current image fusion networks use only simple convolutional layers to extract features, ignoring global dependencies and channel contexts. This paper proposes GRPAFusion, a multimodal image fusion framework based on gradient residual and pyramid attention. The framework uses multiscale gradient residual blocks to extract multiscale structural features and multigranularity detail features from the source images. A pyramid split attention module then adaptively corrects the inter-channel responses of the deep features from the different modalities to generate high-quality fused images. Experimental results on public datasets indicate that GRPAFusion outperforms current fusion methods in both subjective and objective evaluations.
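To make the gradient residual idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-channel image stored as a list of rows, uses fixed Sobel kernels as the gradient operator (an assumption; the abstract does not name the operator), and adds the gradient magnitude back onto the input as a residual. The names `filter3x3` and `gradient_residual` are hypothetical. The blocks described in the abstract operate on learned convolutional features at several scales, which this sketch omits.

```python
# Fixed Sobel kernels (assumed gradient operator, for illustration only).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(image, kernel):
    """Cross-correlate a 2D image with a 3x3 kernel using zero padding."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Pixels outside the image count as zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def gradient_residual(image):
    """Return image + |Gx| + |Gy|, so edges are emphasized and flat regions kept."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    h, w = len(image), len(image[0])
    return [
        [image[y][x] + abs(gx[y][x]) + abs(gy[y][x]) for x in range(w)]
        for y in range(h)
    ]


if __name__ == "__main__":
    # A vertical step edge: left two columns dark, right three columns bright.
    step = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
    for row in gradient_residual(step):
        print(row)
```

In this sketch, interior pixels of a flat region pass through unchanged, while pixels next to an edge are boosted by the gradient term. This is the sense in which a gradient residual path can preserve fine texture alongside the main feature stream.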

Keywords: end-to-end model; gradient residual; image fusion; multimodal image; pyramid attention.

Grants and funding

This research received no external funding.