Confusion Region Mining for Crowd Counting

IEEE Trans Neural Netw Learn Syst. 2023 Sep 15:PP. doi: 10.1109/TNNLS.2023.3311020. Online ahead of print.

Abstract

Existing works mainly focus on crowd and ignore the confusion regions which contain extremely similar appearance to crowd in the background, while crowd counting needs to face these two sides at the same time. To address this issue, we propose a novel end-to-end trainable confusion region discriminating and erasing network called CDENet. Specifically, CDENet is composed of two modules of confusion region mining module (CRM) and guided erasing module (GEM). CRM consists of basic density estimation (BDE) network, confusion region aware bridge and confusion region discriminating network. The BDE network first generates a primary density map, and then the confusion region aware bridge excavates the confusion regions by comparing the primary prediction result with the ground-truth density map. Finally, the confusion region discriminating network learns the difference of feature representations in confusion regions and crowds. Furthermore, GEM gives the refined density map by erasing the confusion regions. We evaluate the proposed method on four crowd counting benchmarks, including ShanghaiTech Part_A, ShanghaiTech Part_B, UCF_CC_50, and UCF-QNRF, and our CDENet achieves superior performance compared with the state-of-the-arts.