Spatial-Frequency Attention Network for Crowd Counting

Xiangyu Guo; Mingliang Gao; Wenzhe Zhai; Jianrun Shang; Qilei Li

doi:10.1089/big.2022.0039

Big Data. 2022 Oct;10(5):453-465. doi: 10.1089/big.2022.0039. Epub 2022 Jun 9.

Authors

Xiangyu Guo¹, Mingliang Gao¹, Wenzhe Zhai¹, Jianrun Shang¹, Qilei Li²

Affiliations

¹ School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China.
² School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom.

PMID: 35679590
DOI: 10.1089/big.2022.0039

Abstract

Counting the number of people in crowded scenarios is a crucial task in video surveillance and urban security systems. Widely deployed surveillance cameras provide big data for training a compelling deep learning-based counting network. However, large-scale variations in dense crowds have not yet been fully addressed. To address this problem, we propose a spatial-frequency attention network (SFANet) for crowd counting in this article. A bottleneck spatial attention module is built to emphasize features at various spatial locations and to adaptively select regions containing individuals in the spatial domain. As a complement, in the frequency domain, a multispectral channel attention module is adopted to obtain a more complete set of frequency components for representing each channel. The two attention modules are combined so that, through their mutual promotion, the network focuses on discriminative regions and suppresses misleading information. Experimental results on five benchmark crowd data sets demonstrate that SFANet achieves state-of-the-art performance in terms of accuracy and robustness.
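
The abstract does not spell out how the multispectral channel attention module is computed. The following is only a minimal sketch of the general idea, assuming a common formulation in which each channel is summarized by one 2D discrete cosine transform (DCT) coefficient instead of plain global average pooling. The function names, the per-channel frequency assignment, and the omission of any learned layers before the sigmoid are illustrative assumptions, not the authors' implementation.

```python
import math


def dct_basis(u, v, height, width):
    """2D DCT-II basis for frequency (u, v) on a height x width grid."""
    return [
        [
            math.cos(math.pi * u * (i + 0.5) / height)
            * math.cos(math.pi * v * (j + 0.5) / width)
            for j in range(width)
        ]
        for i in range(height)
    ]


def channel_descriptor(feature_map, u, v):
    """Project one channel onto the (u, v) DCT basis (a single scalar).

    With (u, v) = (0, 0) the basis is all ones, so this reduces to the
    sum used by global average pooling (up to a 1 / (H * W) factor).
    """
    height, width = len(feature_map), len(feature_map[0])
    basis = dct_basis(u, v, height, width)
    return sum(
        feature_map[i][j] * basis[i][j]
        for i in range(height)
        for j in range(width)
    )


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multispectral_channel_attention(features, frequencies):
    """Reweight each channel by a gate derived from its DCT descriptor.

    features:    list of C feature maps, each an H x W list of floats.
    frequencies: list of C (u, v) pairs, one per channel (hypothetical choice).
    Assumption: learned fully connected layers are omitted; the sigmoid is
    applied directly to the descriptor to keep the sketch self-contained.
    """
    weights = [
        sigmoid(channel_descriptor(fm, u, v))
        for fm, (u, v) in zip(features, frequencies)
    ]
    reweighted = [
        [[w * value for value in row] for row in fm]
        for fm, w in zip(features, weights)
    ]
    return weights, reweighted


# Toy input: channel 0 is constant, channel 1 alternates sign along the width.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, -1.0], [1.0, -1.0]],
]
weights, reweighted = multispectral_channel_attention(features, [(0, 0), (0, 1)])
```

In this toy input, the second channel sums to zero, so an average-pooling descriptor carries no information about it, while the (0, 1) frequency component captures its horizontal variation. This is the sense in which several frequency components can represent a channel more completely than the lowest one alone.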

Keywords: convolutional neural network; crowd counting; density estimation; spatial-frequency attention.

Publication types

Research Support, Non-U.S. Gov't