End-to-end novel visual categories learning via auxiliary self-supervision

Yuanyuan Qing; Yijie Zeng; Qi Cao; Guang-Bin Huang

doi:10.1016/j.neunet.2021.02.015

Neural Netw. 2021 Jul:139:24-32. doi: 10.1016/j.neunet.2021.02.015. Epub 2021 Feb 23.

Authors

Yuanyuan Qing¹, Yijie Zeng², Qi Cao³, Guang-Bin Huang⁴

Affiliations

¹ School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore. Electronic address: qing0006@e.ntu.edu.sg.
² School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore. Electronic address: yzeng004@e.ntu.edu.sg.
³ School of Computing Science, University of Glasgow, Singapore 567739, Singapore. Electronic address: qi.cao@glasgow.ac.uk.
⁴ School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore. Electronic address: egbhuang@ntu.edu.sg.

PMID: 33677376
DOI: 10.1016/j.neunet.2021.02.015

Abstract

Semi-supervised learning has largely alleviated the strong demand for large amounts of annotated data in deep learning. However, most methods assume that labeled data are always available for the same classes as the unlabeled data, which is impractical and restrictive for real-world applications. In this work, we focus on semi-supervised learning when the categories of the unlabeled data and the labeled data are disjoint. The main challenge is how to effectively transfer knowledge from labeled data to unlabeled data when the two are independent of each other and do not belong to the same categories. Previous state-of-the-art methods construct pairwise similarity pseudo labels as supervising signals. However, two issues are inherent in these methods: (1) All previous methods comprise multiple training phases, which makes it difficult to train the model in an end-to-end fashion. (2) Strong dependence on the quality of the pairwise similarity pseudo labels limits performance, as pseudo labels are vulnerable to noise and bias. Therefore, we propose to use self-supervision as an auxiliary task during model training, so that labeled and unlabeled data share the same set of surrogate labels and the overall supervising signals are strongly regularized. As a result, all modules in the proposed algorithm can be trained simultaneously, which boosts learning capability because end-to-end learning is achieved. Moreover, we propose to utilize local structure information in the feature space during pairwise pseudo label construction, as local properties are more robust to noise. Extensive experiments were conducted on three frequently used visual datasets: CIFAR-10, CIFAR-100 and SVHN. The experimental results indicate the effectiveness of the proposed algorithm, which achieves new state-of-the-art performance for novel visual categories learning on all three datasets.
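
The two ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify the self-supervised task or the neighborhood rule, so the sketch assumes rotation prediction as the surrogate task and mutual k-nearest neighbors under cosine similarity as the "local structure" criterion. The function names `rotation_surrogate_batch` and `mutual_knn_pairwise_labels` and the parameter `k` are hypothetical.

```python
import numpy as np


def rotation_surrogate_batch(images):
    """Build rotated copies of a batch and their surrogate labels.

    ASSUMPTION: rotation prediction stands in for the self-supervised
    auxiliary task; the abstract does not name the task it uses.
    images: array of shape (n, h, w, c) with h == w.
    Returns (4n images, 4n labels), where label r means a rotation of r * 90 degrees.
    """
    rotated = [np.rot90(images, k=r, axes=(1, 2)) for r in range(4)]
    labels = np.repeat(np.arange(4), len(images))
    return np.concatenate(rotated, axis=0), labels


def mutual_knn_pairwise_labels(features, k=5):
    """Build a binary pairwise pseudo-label matrix from local structure.

    ASSUMPTION: two samples are labeled "similar" (1) only if each is among
    the other's k nearest neighbors under cosine similarity.
    features: array of shape (n, d) with non-zero rows.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbor

    # Indices of the k most similar samples for every row.
    knn_idx = np.argsort(-sim, axis=1)[:, :k]
    is_neighbor = np.zeros(sim.shape, dtype=bool)
    rows = np.repeat(np.arange(len(features)), k)
    is_neighbor[rows, knn_idx.ravel()] = True

    # Keep only mutual neighbors; every sample is similar to itself.
    pair = is_neighbor & is_neighbor.T
    np.fill_diagonal(pair, True)
    return pair.astype(np.int64)
```

Under these assumptions, the rotation labels are available for labeled and unlabeled images alike, which is what lets both sets share one set of surrogate labels, while the mutual-neighbor test drops pairs that are close in only one direction, a simple way to reduce noisy positive pairs.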

Keywords: Image classification; Local data structure; Novel visual categories; Pairwise similarity; Self-supervision.

MeSH terms

Algorithms*
Pattern Recognition, Automated / classification*
Supervised Machine Learning / classification*