Feature fusion network based on strip pooling

Gaihua Wang; Qianyu Zhai

doi:10.1038/s41598-021-00585-z

Sci Rep. 2021 Oct 28;11(1):21270. doi: 10.1038/s41598-021-00585-z.

Authors

Gaihua Wang¹ ², Qianyu Zhai³

Affiliations

¹ School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, 430068, China.
² Hubei Key Laboratory for High-Efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Wuhan, 430068, China.
³ School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, 430068, China. zhaiqianyu233@163.com.

Abstract

Contextual information is a key factor in semantic segmentation. Recently, many methods have used the self-attention mechanism to capture more contextual information. However, self-attention is computationally expensive. To address this problem, we design a novel self-attention network, called FFANet, that captures contextual information efficiently by reducing the amount of computation through strip pooling and linear layers. The network introduces a feature fusion (FF) module to calculate the affinity matrix, which captures the relationships between pixels. The affinity matrix is then multiplied with the feature map, which selectively increases the weight of the regions of interest. Extensive experiments on two public datasets (PASCAL VOC2012 and Cityscapes) and a remote sensing dataset (DLRSD) yield mean IoU scores of 74.5%, 70.3%, and 63.9%, respectively. Compared with current typical algorithms, the proposed method achieves excellent performance.
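The abstract does not spell out the FF module, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes that strip pooling averages the feature map along each row and each column, that three linear projections (the hypothetical matrices `w_q`, `w_k`, `w_v`) produce queries, keys, and values, and that the affinity matrix is a softmax over pixel-to-strip similarities. All function names and shapes are illustrative.

```python
import numpy as np

def strip_pool(x):
    """Average a (C, H, W) feature map along rows and columns.

    Returns strip descriptors of shape (C, H + W): H row strips
    followed by W column strips.
    """
    row_strips = x.mean(axis=2)  # (C, H): each row averaged over its width
    col_strips = x.mean(axis=1)  # (C, W): each column averaged over its height
    return np.concatenate([row_strips, col_strips], axis=1)

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def strip_affinity_attention(x, w_q, w_k, w_v):
    """Assumed attention form: pixels attend to strip-pooled features.

    x:   (C, H, W) feature map
    w_q: (C, D) linear projection for pixel queries (hypothetical)
    w_k: (C, D) linear projection for strip keys (hypothetical)
    w_v: (C, C) linear projection for strip values (hypothetical)
    """
    c, h, w = x.shape
    pixels = x.reshape(c, h * w).T   # (HW, C): one row per pixel
    strips = strip_pool(x).T         # (H + W, C): one row per strip
    q = pixels @ w_q                 # (HW, D)
    k = strips @ w_k                 # (H + W, D)
    v = strips @ w_v                 # (H + W, C)
    # Affinity between every pixel and every strip; each row sums to 1.
    affinity = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)  # (HW, H + W)
    out = affinity @ v               # (HW, C): features reweighted by affinity
    return affinity, out.T.reshape(c, h, w)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 10))      # C=8, H=6, W=10
w_q = rng.standard_normal((8, 4))
w_k = rng.standard_normal((8, 4))
w_v = rng.standard_normal((8, 8))
affinity, y = strip_affinity_attention(x, w_q, w_k, w_v)
print(affinity.shape, y.shape)           # (60, 16) (8, 6, 10)
```

Under these assumptions the affinity matrix has HW × (H + W) entries (60 × 16 = 960 here) rather than the HW × HW entries (3,600 here) of full pixel-to-pixel self-attention, which illustrates where a computational saving of this kind can come from.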

Publication types

Research Support, Non-U.S. Gov't