Adversarial Learning for Multiscale Crowd Counting Under Complex Scenes

IEEE Trans Cybern. 2021 Nov;51(11):5423-5432. doi: 10.1109/TCYB.2019.2956091. Epub 2021 Nov 9.

Abstract

In this article, a multiscale generative adversarial network (MS-GAN) is proposed for generating high-quality crowd density maps for scenes of arbitrary crowd density. Crowd counting poses many challenges, such as severe occlusion in extremely dense crowds, perspective distortion, and high visual similarity between pedestrians and background elements. To address these problems, the proposed MS-GAN combines a multiscale convolutional neural network (the generator) with an adversarial network (the discriminator) to generate a high-quality density map and accurately estimate the crowd count in complex scenes. The multiscale generator fuses features from multiple hierarchical layers to detect people across large variations in scale. The density map produced by the generator is then passed to a discriminator network trained on a binary classification task: distinguishing poor-quality density maps from real ground-truth ones. The additional adversarial loss can improve the quality of the density map, which is critical for accurately estimating the crowd count. Experiments were conducted on multiple datasets covering different crowd scenes and densities. The results showed that the proposed method outperformed current state-of-the-art methods.
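To make the training objective described in the abstract more concrete, the following is a minimal sketch of how a density-regression loss and an adversarial loss might be combined, and how a count is read off a density map. It is not the authors' implementation. The abstract does not give the loss formulation, so the pixel-wise L2 term, the binary cross-entropy form of the adversarial term, the weight `adv_weight`, and all function names are assumptions made for illustration. The discriminator outputs are stand-in probabilities rather than the output of a real network.

```python
import numpy as np


def crowd_count(density_map):
    """Estimate the count as the sum of the density map (standard for density-based counting)."""
    return float(np.sum(density_map))


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy between discriminator probabilities and 0/1 labels."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))))


def generator_loss(pred_map, gt_map, d_prob_on_pred, adv_weight=1e-3):
    """Assumed generator objective: pixel-wise L2 plus a weighted adversarial term."""
    l2 = float(np.mean((pred_map - gt_map) ** 2))
    # The generator is rewarded when the discriminator labels its maps as real (1).
    adv = bce(d_prob_on_pred, np.ones_like(d_prob_on_pred))
    return l2 + adv_weight * adv  # adv_weight is a hypothetical hyperparameter


def discriminator_loss(d_prob_on_real, d_prob_on_pred):
    """Binary classification: ground-truth maps are labeled 1, generated maps 0."""
    real = bce(d_prob_on_real, np.ones_like(d_prob_on_real))
    fake = bce(d_prob_on_pred, np.zeros_like(d_prob_on_pred))
    return real + fake


# Toy 2x2 maps; the ground truth integrates to 3 people.
gt = np.array([[1.0, 0.5], [0.5, 1.0]])
pred = np.array([[0.9, 0.6], [0.4, 1.0]])
print(crowd_count(gt), crowd_count(pred))
print(generator_loss(pred, gt, d_prob_on_pred=np.array([0.3])))
```

Under these assumptions, the adversarial term only adds a penalty when the discriminator can tell generated maps from ground-truth ones, which is one way to read the abstract's claim that the extra loss improves density map quality.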