Deep Learning Model Compression With Rank Reduction in Tensor Decomposition

IEEE Trans Neural Netw Learn Syst. 2023 Nov 17:PP. doi: 10.1109/TNNLS.2023.3330542. Online ahead of print.

Abstract

Large neural network models are hard to deploy on lightweight edge devices and demand large network bandwidth. In this article, we propose a novel deep learning (DL) model compression method. Specifically, we present a dual-model training strategy with iterative and adaptive rank reduction (RR) in tensor decomposition. Our method regularizes DL models while preserving model accuracy, and the adaptive RR significantly reduces the hyperparameter search space. We provide a theoretical analysis of the convergence and complexity of the proposed method. Testing our method on LeNet, VGG, ResNet, EfficientNet, and RevCol over the MNIST, CIFAR-10/100, and ImageNet datasets, we find that it outperforms the baseline compression methods in both model compression and accuracy preservation, and the experimental results validate our theoretical findings. For VGG-16 on the CIFAR-10 dataset, our compressed model shows a 0.88% accuracy gain with 10.41 times storage reduction and 6.29 times speedup. For ResNet-50 on the ImageNet dataset, our compressed model achieves 2.36 times storage reduction and 2.17 times speedup. In federated learning (FL) applications, our scheme reduces the communication overhead by 13.96 times. In summary, our compressed DL method can significantly improve image understanding and pattern recognition processes.
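To make the rank-reduction idea concrete, the following minimal NumPy sketch compresses a single weight matrix with a truncated SVD and iteratively lowers the rank while a relative reconstruction-error budget holds. This is an illustration of the general technique only, not the paper's dual-model training algorithm; the `tol` error budget is a hypothetical stand-in for the accuracy-preservation check described in the abstract.

```python
import numpy as np

def adaptive_rank_reduction(W, tol=0.05):
    """Find the smallest rank whose truncated-SVD reconstruction of W
    keeps the relative Frobenius error within `tol`, then return the
    two low-rank factors A (m x r) and B (r x n) with W ~ A @ B."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    frob = np.linalg.norm(S)  # ||W||_F equals the norm of the singular values
    rank = len(S)
    # Lower the rank step by step while the discarded tail of singular
    # values stays within the error budget (Eckart-Young bound).
    while rank > 1 and np.linalg.norm(S[rank - 1:]) / frob <= tol:
        rank -= 1
    A = U[:, :rank] * S[:rank]  # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B, rank

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "weight matrix": low-rank structure plus small noise,
    # standing in for a trained fully connected layer.
    W = rng.standard_normal((256, 20)) @ rng.standard_normal((20, 512))
    W += 0.01 * rng.standard_normal(W.shape)
    A, B, r = adaptive_rank_reduction(W, tol=0.05)
    before, after = W.size, A.size + B.size
    print(f"rank {r}: {before} -> {after} parameters "
          f"({before / after:.2f} times storage reduction)")
```

Replacing a dense layer with the two factors also cuts inference cost, since a matrix-vector product against A @ B costs O(r(m + n)) instead of O(mn), which is the source of the speedups reported above.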