Deep Unbiased Embedding Transfer for Zero-shot Learning

IEEE Trans Image Process. 2019 Oct 22. doi: 10.1109/TIP.2019.2947780. Online ahead of print.

Abstract

Zero-shot learning aims to recognize objects which do not appear in the training dataset. Previous prevalent mapping-based zero-shot learning methods suffer from the projection domain shift problem due to the lack of image classes in the training stage. In order to alleviate the projection domain shift problem, a deep unbiased embedding transfer (DUET) model is proposed in this paper. The DUET model is composed of a deep embedding transfer (DET) module and an unseen visual feature generation (UVG) module. In the DET module, a novel combined embedding transfer net which integrates the complementary merits of the linear and nonlinear embedding mapping functions is proposed to connect the visual space and semantic space. What's more, the end-to-end joint training process is implemented to train the visual feature extractor and the combined embedding transfer net simultaneously. In the UVG module, a visual feature generator trained with a conditional generative adversarial framework is used to synthesize the visual features of the unseen classes to ease the disturbance of the projection domain shift problem. Furthermore, a quantitative index, namely the score of resistance on domain shift (ScoreRDS), is proposed to evaluate different models regarding their resistance capability on the projection domain shift problem. The experiments on five zero-shot learning benchmarks verify the effectiveness of the proposed DUET model. As demonstrated by the qualitative and quantitative analysis, the unseen class visual feature generation, the combined embedding transfer net and the end-to-end joint training process all contribute to alleviating projection domain shift in zero-shot learning.