Human Co-Parsing Guided Alignment for Occluded Person Re-identification

Shuguang Dou; Cairong Zhao; Xinyang Jiang; Shanshan Zhang; Wei-Shi Zheng; Wangmeng Zuo

doi:10.1109/TIP.2022.3229639

IEEE Trans Image Process. 2022 Dec 20:PP. doi: 10.1109/TIP.2022.3229639. Online ahead of print.

PMID: 37015433

Abstract

Occluded person re-identification (ReID) is challenging because occlusion introduces more background noise and leaves foreground information incomplete. Although existing human parsing-based ReID methods can tackle this problem with semantic alignment at the finest pixel level, their performance depends heavily on the quality of the human parsing model. Most supervised methods train an extra human parsing model alongside the ReID model using cross-domain human part annotations, and therefore suffer from expensive annotation costs and a domain gap. Unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but the lack of supervision signals leads to less satisfactory segmentation results. In this paper, we argue that the information already present in the ReID training dataset can be used directly as supervision signals to train the human parsing model without any extra annotation. By integrating a weakly supervised human co-parsing network into the ReID network, we propose a novel framework, called Human Co-parsing Guided Alignment (HCGA), that exploits information shared across different images of the same pedestrian. Specifically, the human co-parsing network is weakly supervised by three consistency criteria: global semantics, local space, and background. Feeding the semantic information and deep features from the person ReID network into the guided alignment module then yields foreground and human-part features for effective occluded person ReID. Experimental results on two occluded and two holistic datasets demonstrate the superiority of our method. In particular, on Occluded-DukeMTMC it achieves 70.2% Rank-1 accuracy and 57.5% mAP.
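The abstract does not spell out how the guided alignment module combines semantic information with deep features. As a rough illustration only, the sketch below assumes the combination is mask-weighted average pooling, a common choice in parsing-based ReID. The function name `part_pooled_features`, the tensor shapes, and the convention that class 0 is background are all hypothetical and not taken from the paper.

```python
import numpy as np


def part_pooled_features(features, parsing_logits, eps=1e-6):
    """Pool a feature map into one vector per semantic class.

    features:       (C, H, W) deep feature map from a ReID backbone.
    parsing_logits: (K, H, W) per-pixel class scores; class 0 is treated as
                    background here (an assumption made for this sketch).
    Returns (part_feats, foreground_feat) with shapes (K, C) and (C,).
    """
    C, H, W = features.shape
    K = parsing_logits.shape[0]

    # Softmax over the class axis gives a soft assignment for each pixel.
    shifted = parsing_logits - parsing_logits.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)

    flat_feats = features.reshape(C, H * W)   # (C, HW)
    flat_probs = probs.reshape(K, H * W)      # (K, HW)

    # Mask-weighted average: each class pools mostly the pixels assigned to it.
    weights = flat_probs / (flat_probs.sum(axis=1, keepdims=True) + eps)
    part_feats = weights @ flat_feats.T       # (K, C)

    # Foreground: all non-background classes merged into a single soft mask.
    fg_mask = flat_probs[1:].sum(axis=0)      # (HW,)
    foreground_feat = flat_feats @ (fg_mask / (fg_mask.sum() + eps))
    return part_feats, foreground_feat


# Toy inputs: 8 feature channels, 3 classes (background + 2 parts), 6x4 map.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 6, 4))
logits = rng.normal(size=(3, 6, 4))
parts, fg = part_pooled_features(feats, logits)
```

Under this assumption, pixels belonging to an occluder or to the background contribute little to the part and foreground vectors, which is the intuition behind pixel-level semantic alignment for occluded ReID.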