Efficient Neural Network Compression Inspired by Compressive Sensing

IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):1965-1979. doi: 10.1109/TNNLS.2022.3186008. Epub 2024 Feb 5.

Abstract

Traditional neural network compression (NNC) methods reduce model size and floating-point operations (FLOPs) by screening out unimportant weight parameters; however, the intrinsic sparsity characteristics of the parameters have not been fully exploited. In this article, from the perspective of signal processing and analysis of network parameters, we propose a compressive sensing (CS)-based method, namely NNCS, for performance improvement. NNCS is inspired by the observation that the sparsity levels of weight parameters in a transform domain are greater than those in the original domain. First, to obtain sparse representations of the parameters in the transform domain during training, we incorporate a constrained CS model into the loss function. Second, the proposed training process consists of two steps: the first step trains the raw weight parameters and induces and reconstructs their sparse representations, and the second step trains the transform coefficients to improve network performance. Finally, we convert the entire neural network into a transform-domain representation, yielding a sparser parameter distribution that facilitates inference acceleration. Experimental results demonstrate that NNCS significantly outperforms existing state-of-the-art methods in terms of parameter and FLOP reductions. With VGGNet on CIFAR-10, we reduce parameters by 94.8% and FLOPs by 76.8%, with a 0.13% drop in Top-1 accuracy. With ResNet-50 on ImageNet, we reduce parameters by 75.6% and FLOPs by 78.9%, with a 1.24% drop in Top-1 accuracy.
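To make the core idea concrete, below is a minimal PyTorch sketch of encouraging transform-domain sparsity during training: each convolutional filter is mapped through a fixed orthonormal DCT and an L1 penalty is applied to the resulting coefficients. This is only an illustrative stand-in, not the paper's constrained CS formulation or its two-step training schedule; the names dct_matrix and transform_sparsity_penalty and the weight lam are assumptions introduced here for the example.

    import math
    import torch

    def dct_matrix(n, device=None, dtype=torch.float32):
        # Orthonormal DCT-II matrix D, so that D @ x gives the DCT coefficients of x.
        k = torch.arange(n, dtype=dtype, device=device).unsqueeze(1)  # frequency index
        i = torch.arange(n, dtype=dtype, device=device).unsqueeze(0)  # sample index
        D = torch.cos(math.pi / n * (i + 0.5) * k)
        D[0] *= 1.0 / math.sqrt(2.0)          # orthonormal scaling of the DC row
        return D * math.sqrt(2.0 / n)

    def transform_sparsity_penalty(model, lam=1e-4):
        # Hypothetical regularizer: L1 norm of per-filter DCT coefficients,
        # pushing the weights toward a sparse transform-domain representation.
        penalty = 0.0
        for m in model.modules():
            if isinstance(m, torch.nn.Conv2d):
                w = m.weight.reshape(m.weight.shape[0], -1)            # (out_ch, d)
                D = dct_matrix(w.shape[1], device=w.device, dtype=w.dtype)
                coeffs = w @ D.t()                                     # per-filter coefficients
                penalty = penalty + coeffs.abs().sum()
        return lam * penalty

    # Usage inside a standard training loop (criterion, model, x, y assumed defined):
    #   loss = criterion(model(x), y) + transform_sparsity_penalty(model)
    #   loss.backward()

Because the DCT matrix here is a constant tensor, the penalty is differentiable with respect to the weights and can simply be added to the task loss; the actual NNCS method additionally reconstructs and retrains the transform coefficients, which this sketch does not cover.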