DyCR: A Dynamic Clustering and Recovering Network for Few-Shot Class-Incremental Learning

Zicheng Pan; Xiaohan Yu; Miaohua Zhang; Weichuan Zhang; Yongsheng Gao

doi:10.1109/TNNLS.2024.3394844

IEEE Trans Neural Netw Learn Syst. 2024 May 16:PP. doi: 10.1109/TNNLS.2024.3394844. Online ahead of print.

PMID: 38753482

Abstract

Few-shot class-incremental learning (FSCIL) aims to continually learn from novel data with limited samples. A major challenge is catastrophic forgetting of old knowledge while the model is trained on new data. To alleviate this problem, recent state-of-the-art methods keep a well-trained static network with fixed parameters during the incremental learning stages to preserve old knowledge. These methods suffer from poor adaptation of the old model to new knowledge. In this work, a dynamic clustering and recovering network (DyCR) is proposed to tackle the adaptation problem and effectively mitigate the forgetting phenomenon on FSCIL tasks. Unlike static FSCIL methods, the proposed DyCR network is dynamic and trainable during the incremental learning stages, which makes it capable of learning new features and adapting better to novel data. To address the forgetting problem and improve model performance, a novel orthogonal decomposition mechanism is developed to split the feature embeddings into context and category information. The context part is preserved and used to recover old class features in future incremental learning stages, which mitigates forgetting while storing far less data than saving the raw exemplars would require. The category part is used to optimize the feature embedding space during training by pushing samples of different classes far apart and reducing the distances between samples of the same class. Experiments show that the DyCR network outperforms existing methods on four benchmark datasets. The code is available at: https://github.com/zichengpan/DyCR.
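
The abstract does not say how the orthogonal decomposition is computed. As a rough illustration only, the sketch below splits a feature vector into a component along a chosen direction and an orthogonal residual, using a plain vector projection. The function name `orthogonal_split`, the choice of a class direction as the projection axis, and the labelling of the two parts as "category" and "context" are all assumptions made here for illustration. They are not taken from the paper or from the DyCR repository.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_split(feature, direction):
    """Split `feature` into a part parallel to `direction` and an
    orthogonal residual, so that parallel + residual == feature.

    Hypothetical helper: a generic vector projection, not the
    decomposition used by DyCR.
    """
    norm_sq = dot(direction, direction)
    if norm_sq == 0.0:
        raise ValueError("direction must be non-zero")
    scale = dot(feature, direction) / norm_sq
    parallel = [scale * d for d in direction]
    residual = [f - p for f, p in zip(feature, parallel)]
    return parallel, residual


# Toy 3-dimensional embedding and an assumed class direction.
feature = [3.0, 4.0, 0.0]
class_direction = [1.0, 0.0, 0.0]  # assumption: e.g. a normalized class prototype

# Assumed labelling: the parallel part plays the role of "category"
# information and the residual plays the role of "context" information.
category_part, context_part = orthogonal_split(feature, class_direction)

# The two parts are orthogonal and sum back to the original feature.
orthogonality = dot(category_part, context_part)
reconstructed = [c + r for c, r in zip(category_part, context_part)]
```

In this toy setting the two components have zero inner product and add back to the original embedding, which is the defining property of an orthogonal split. How the paper chooses the subspaces, and which part it stores for feature recovery, should be checked against the published code.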