Model-Induced Generalization Error Bound for Information-Theoretic Representation Learning in Source-Data-Free Unsupervised Domain Adaptation

IEEE Trans Image Process. 2022;31:419-432. doi: 10.1109/TIP.2021.3130530. Epub 2021 Dec 9.

Abstract

Many unsupervised domain adaptation (UDA) methods have been developed and have achieved promising results in various pattern recognition tasks. However, most existing methods assume that raw source data are available in the target domain when transferring knowledge from the source to the target domain. Due to emerging regulations on data privacy, the availability of source data cannot be guaranteed when applying UDA methods in a new domain. The lack of source data makes UDA more challenging, and most existing methods are no longer applicable. To address this issue, this paper analyzes cross-domain representations in source-data-free unsupervised domain adaptation (SF-UDA). A new theorem is derived that bounds the target-domain prediction error using the trained source model instead of the source data. On the basis of this theorem, information bottleneck theory is introduced to minimize the generalization upper bound of the target-domain prediction error, thereby achieving domain adaptation. The minimization is implemented in a variational inference framework via a newly developed latent alignment variational autoencoder (LA-VAE). Experimental results show that the proposed method performs well on several cross-dataset classification tasks without using source data. Ablation studies and feature visualizations further validate its effectiveness in SF-UDA.
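For orientation, the information bottleneck principle invoked in the abstract is commonly written as the following Lagrangian; this is a generic sketch of the standard formulation, not the paper's specific generalization bound or the LA-VAE objective:

```latex
% Standard information bottleneck Lagrangian (generic sketch).
% Z: learned representation, X: input, Y: prediction target,
% I(.;.): mutual information, \beta: compression/prediction trade-off.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```

Minimizing $I(X;Z)$ compresses the representation while maximizing $I(Z;Y)$ preserves predictive information; in variational frameworks such as VAEs, both mutual information terms are typically replaced by tractable variational bounds.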