Block-Wise Partner Learning for Model Compression

IEEE Trans Neural Netw Learn Syst. 2023 Sep 1:PP. doi: 10.1109/TNNLS.2023.3306512. Online ahead of print.

Abstract

Despite the great potential of convolutional neural networks (CNNs) in various tasks, their resource-hungry nature greatly hinders wide deployment in cost-sensitive and low-powered scenarios, especially remote-sensing applications. Existing model pruning approaches, implemented as a "subtraction" operation, impose a performance ceiling on the slimmed model. Self-knowledge distillation (Self-KD) resorts to auxiliary networks that are active only in the training phase to improve performance; however, the transferred knowledge is holistic and coarse, and the learning-based knowledge transfer is indirect and lossy. Here, we propose a novel model-compression method, termed block-wise partner learning (BPL), which comprises "extension" and "fusion" operations and frees the compressed model from the performance bound of its baseline. Unlike Self-KD, the proposed BPL creates a partner for each block to enhance performance during training. So that the model absorbs more diverse information, a diversity loss (DL) is designed to measure the difference between the original block and its partner. Moreover, the partner is fused equivalently into the block rather than being discarded. After training, we can simply adopt the fused compressed model, which retains the enhancement information of the partners while having fewer parameters and a lower inference cost. As validated on the UC Merced land-use, NWPU-RESISC45, and RSD46-WHU datasets, BPL demonstrates superiority over the other compared model-compression approaches. For example, it attains a substantial floating-point operations (FLOPs) reduction of 73.97% with only a 0.24 accuracy (ACC.) loss for ResNet-50 on the UC Merced land-use dataset. The code is available at https://github.com/zhangxin-xd/BPL.
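
The sketch below is a minimal, hypothetical PyTorch illustration of the "extension" and "fusion" ideas summarized above; it is not the authors' implementation (see the linked repository for that). The class name PartnerConv, the use of a single 3x3 convolution per branch, the cosine-similarity form of the diversity term, and the kernel-summation fusion are all assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartnerConv(nn.Module):
    """Illustrative block: an original conv plus a partner conv that is
    active only during training and later merged back ("fused")."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Hypothetical choice: identically shaped, bias-free 3x3 convs so the
        # two parallel branches can be merged exactly after training.
        self.original = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.partner = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.fused = False

    def forward(self, x):
        if self.fused:
            return self.original(x)
        # "Extension": during training both branches contribute.
        return self.original(x) + self.partner(x)

    def diversity_loss(self, x):
        # Hypothetical diversity term: penalize similarity between the two
        # branches' features so the partner contributes new information.
        a = self.original(x).flatten(1)
        b = self.partner(x).flatten(1)
        return F.cosine_similarity(a, b, dim=1).mean()

    @torch.no_grad()
    def fuse(self):
        # "Fusion": two parallel convolutions of equal shape are linear, so
        # summing their kernels yields one conv with identical outputs.
        self.original.weight += self.partner.weight
        self.fused = True
```

Under these assumptions, the fused block produces exactly the same outputs as the two-branch training-time block while keeping only the original convolution's parameters and inference cost, which mirrors the equivalent-fusion property claimed in the abstract.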