Self-Paced Collaborative and Adversarial Network for Unsupervised Domain Adaptation

IEEE Trans Pattern Anal Mach Intell. 2021 Jun;43(6):2047-2061. doi: 10.1109/TPAMI.2019.2962476. Epub 2021 May 11.

Abstract

This paper proposes a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN), which uses the domain-collaborative and domain-adversarial learning strategies for training the neural network. The domain-collaborative learning strategy aims to learn domain specific feature representation to preserve the discriminability for the target domain, while the domain adversarial learning strategy aims to learn domain invariant feature representation to reduce the domain distribution mismatch between the source and target domains. We show that these two learning strategies can be uniformly formulated as domain classifier learning with positive or negative weights on the losses. We then design a collaborative and adversarial training scheme, which automatically learns domain specific representations from lower blocks in CNNs through collaborative learning and domain invariant representations from higher blocks through adversarial learning. Moreover, to further enhance the discriminability in the target domain, we propose Self-Paced CAN (SPCAN), which progressively selects pseudo-labeled target samples for re-training the classifiers. We employ a self-paced learning strategy such that we can select pseudo-labeled target samples in an easy-to-hard fashion. Additionally, we build upon the popular two-stream approach to extend our domain adaptation approach for more challenging video action recognition task, which additionally considers the cooperation between the RGB stream and the optical flow stream. We propose the Two-stream SPCAN (TS-SPCAN) method to select and reweight the pseudo labeled target samples of one stream (RGB/Flow) based on the information from the other stream (Flow/RGB) in a cooperative way. As a result, our TS-SPCAN model is able to exchange the information between the two streams. Comprehensive experiments on different benchmark datasets, Office-31, ImageCLEF-DA and VISDA-2017 for the object recognition task, and UCF101-10 and HMDB51-10 for the video action recognition task, show our newly proposed approaches achieve the state-of-the-art performance, which clearly demonstrates the effectiveness of our proposed approaches for unsupervised domain adaptation.

Publication types

  • Research Support, Non-U.S. Gov't