Learning Transferable Parameters for Unsupervised Domain Adaptation

IEEE Trans Image Process. 2022;31:6424-6439. doi: 10.1109/TIP.2022.3184848. Epub 2022 Oct 21.

Abstract

Unsupervised domain adaptation (UDA) enables a learning machine to adapt from a labeled source domain to an unlabeled target domain under distribution shift. Thanks to the strong representation ability of deep neural networks, recent remarkable achievements in UDA resort to learning domain-invariant features. Intuitively, the hope is that a good feature representation, together with the hypothesis learned from the source domain, can generalize well to the target domain. However, the learning processes of domain-invariant features and source hypotheses inevitably involve domain-specific information that degrades the generalizability of UDA models on the target domain. The lottery ticket hypothesis shows that only a subset of parameters is essential for generalization. Motivated by this, we find in this paper that only a subset of parameters is essential for learning domain-invariant information; we term these transferable parameters, as they generalize well in UDA. The remaining parameters tend to fit domain-specific details and often cause generalization to fail; we term these untransferable parameters. Driven by this insight, we propose Transferable Parameter Learning (TransPar) to reduce the side effect of domain-specific information in the learning process and thus enhance the memorization of domain-invariant information. Specifically, in each training iteration we divide all parameters into transferable and untransferable ones according to their distribution discrepancy degree, and we then apply separate update rules to the two types of parameters. Extensive experiments on image classification and regression (keypoint detection) tasks show that TransPar outperforms prior art by non-trivial margins. Moreover, experiments demonstrate that TransPar can be integrated into the most popular deep UDA networks and easily extended to handle any data distribution shift scenario.
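To make the parameter-splitting idea concrete, the following is a minimal PyTorch-style sketch of one training iteration, not the authors' implementation: the per-parameter score |theta * dL_disc/dtheta| used to rank parameters, the keep_ratio, and the choice to zero the task gradients of untransferable parameters are all illustrative assumptions; the paper derives its own discrepancy-degree criterion and update rules.

    import torch

    def transpar_step(model, task_loss, discrepancy_loss, optimizer, keep_ratio=0.8):
        """One TransPar-style iteration (sketch): rank parameters by a
        discrepancy-based score, update the transferable subset normally,
        and suppress the update to the untransferable rest."""
        params = [p for p in model.parameters() if p.requires_grad]

        # Gradients of the distribution-discrepancy loss serve as a proxy for
        # how strongly each parameter is tied to domain-specific information.
        disc_grads = torch.autograd.grad(
            discrepancy_loss, params, retain_graph=True, allow_unused=True)

        optimizer.zero_grad()
        task_loss.backward()

        with torch.no_grad():
            for p, g in zip(params, disc_grads):
                if g is None or p.grad is None:
                    continue
                # Hypothetical score (the paper defines its own
                # discrepancy-degree criterion): |theta * dL_disc/dtheta|.
                score = (p * g).abs().flatten()
                k = max(1, int(keep_ratio * score.numel()))
                # Threshold such that the top-k scores count as transferable.
                threshold = score.kthvalue(score.numel() - k + 1).values
                mask = (score >= threshold).view_as(p)
                # Separate update rules: here, untransferable parameters
                # simply receive no task-gradient update this iteration.
                p.grad.mul_(mask.to(p.grad.dtype))

        optimizer.step()

In this sketch, task_loss would be the supervised source-domain loss and discrepancy_loss a distribution-discrepancy measure (e.g., a domain-adversarial loss), both computed on the current mini-batch before calling the step.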

MeSH terms

  • Neural Networks, Computer*