Hierarchical Threshold Pruning Based on Uniform Response Criterion

Yaguan Qian; Zhiqiang He; Yuqi Wang; Bin Wang; Xiang Ling; Zhaoquan Gu; Haijiang Wang; Shaoning Zeng; Wassim Swaileh

doi:10.1109/TNNLS.2023.3244994

IEEE Trans Neural Netw Learn Syst. 2023 Apr 18:PP. doi: 10.1109/TNNLS.2023.3244994. Online ahead of print.

PMID: 37071515

Abstract

Convolutional neural networks (CNNs) have been successfully applied in various fields. However, the overparameterization of CNNs demands more memory and training time, making them unsuitable for some resource-constrained devices. To address this issue, filter pruning has been proposed as one of the most efficient approaches. In this article, we propose a feature-discrimination-based filter importance criterion, the uniform response criterion (URC), as a key component of filter pruning. It converts the maximum activation responses into probabilities and then measures the importance of a filter through the distribution of these probabilities over classes. However, applying URC directly to global threshold pruning may cause two problems. First, some layers can be pruned completely under global pruning settings. Second, global threshold pruning neglects that filters in different layers have different importance. To address these issues, we propose hierarchical threshold pruning (HTP) with URC. It restricts each pruning step to a relatively redundant layer rather than comparing filter importance across all layers, which avoids pruning some important filters. The effectiveness of our method benefits from three techniques: 1) measuring filter importance by URC; 2) normalizing filter scores; and 3) pruning in relatively redundant layers. Extensive experiments on CIFAR-10/100 and ImageNet show that our method achieves state-of-the-art performance on multiple benchmarks.
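
The abstract does not give the URC formula or the layer-selection rule, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes non-negative (post-ReLU) maximum responses, uses one minus normalized entropy as a stand-in measure of how far a filter's class distribution is from uniform, treats a near-uniform response as less discriminative and therefore less important, normalizes scores by the layer maximum, and takes the layer with the lowest mean normalized score as the "relatively redundant" one. All function names and these specific choices are hypothetical.

```python
import math


def class_probabilities(max_responses, labels, num_classes):
    """Average one filter's per-sample max activation within each class,
    then normalize the class means so they sum to 1 (assumed conversion)."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for response, label in zip(max_responses, labels):
        sums[label] += response
        counts[label] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    total = sum(means)
    if total == 0:
        # A filter that never fires is treated as responding uniformly.
        return [1.0 / num_classes] * num_classes
    return [m / total for m in means]


def non_uniformity_score(probs):
    """Stand-in importance: 0 for a uniform distribution, 1 for one-hot."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def normalize_by_layer_max(scores):
    """Scale scores within one layer so layers become comparable (assumed)."""
    highest = max(scores)
    if highest <= 0:
        return [0.0] * len(scores)
    return [s / highest for s in scores]


def prune_step(layer_scores, threshold):
    """Pick one relatively redundant layer and return the filter indices
    in that layer whose normalized score falls below the threshold."""
    normalized = {name: normalize_by_layer_max(s)
                  for name, s in layer_scores.items()}
    # Assumed redundancy proxy: lowest mean normalized score.
    target = min(normalized,
                 key=lambda name: sum(normalized[name]) / len(normalized[name]))
    pruned = [i for i, s in enumerate(normalized[target]) if s < threshold]
    if len(pruned) == len(normalized[target]):
        pruned = pruned[1:]  # never remove a whole layer
    return target, pruned
```

Because each step touches a single layer and always keeps at least one filter, this sketch cannot empty a layer, which is the first failure mode the abstract attributes to global threshold pruning. Repeating `prune_step` after recomputing scores would give an iterative schedule, though the actual schedule used in the paper is not stated here.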