IVS-Caffe: Hardware-Oriented Neural Network Model Development

IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5978-5992. doi: 10.1109/TNNLS.2021.3072145. Epub 2022 Oct 5.

Abstract

This article proposes a hardware-oriented neural network development tool called Intelligent Vision System Lab (IVS)-Caffe. IVS-Caffe simulates the hardware behavior of convolutional neural network (CNN) inference. It quantizes the weights, input features, and output features of a CNN and simulates the calculations of the multipliers and accumulators to produce bit-accurate results, so it can be used to test the accuracy of a chosen CNN hardware accelerator. In addition, this article proposes an algorithm that corrects the deviation of gradient backpropagation through the bit-accurate quantized multipliers and accumulators. This makes it possible to train a bit-accurate model and further increases the accuracy of the CNN model at a user-designed bit width. The experimental models are Faster region-based CNN (R-CNN) + Matthew D. Zeiler and Rob Fergus (ZF)-Net, Single Shot MultiBox Detector (SSD) + VGG, SSD + MobileNet, and Tiny you only look once (YOLO) v2. These models cover both one-stage and two-stage object detection, and their base networks include the convolution layer, the fully connected layer, and modern advanced layers such as the inception module and depthwise separable convolution. In these experiments, directly quantizing layer-I/O fixed-point models to bit-accurate models causes a 2% drop in mean average precision (mAP) under the constraint that all layers' accumulators and multipliers are quantized to at most 14 and 12 bits, respectively. After these quantized models are retrained with the proposed IVS-Caffe, the mAP drop is less than 1% under the constraint that all layers' accumulators and multipliers are quantized to at most 14 and 11 bits, respectively. With the proposed IVS-Caffe, the accuracy of a target model can be analyzed as it would run on hardware accelerators with different bit widths, which helps in fine-tuning the target model or in customizing hardware accelerators for lower power consumption. Code is available at https://github.com/apple35932003/IVS-Caffe.
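
To make the idea of bit-accurate simulation concrete, the following is a minimal NumPy sketch of a dot product in which every multiplier output and every accumulator update is rounded and saturated to a fixed bit width. It is not taken from the IVS-Caffe code base: the function names `quantize` and `bit_accurate_dot`, the split between integer and fractional bits, round-to-nearest rounding, and saturation on overflow are all assumptions chosen for illustration. Only the 12-bit multiplier and 14-bit accumulator widths echo the constraints quoted in the abstract.

```python
import numpy as np

def quantize(x, total_bits, frac_bits):
    """Map x onto a signed fixed-point grid (hypothetical helper).

    Assumes round-to-nearest (NumPy rounds ties to even) and saturation
    on overflow; real hardware may truncate or wrap instead.
    """
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(np.asarray(x, dtype=np.float64) * scale), qmin, qmax)
    return q / scale

def bit_accurate_dot(w, x, io_bits=8, io_frac=6,
                     mul_bits=12, mul_frac=8,
                     acc_bits=14, acc_frac=8):
    """Dot product with quantized I/O, multiplier, and accumulator.

    All bit widths and fractional lengths are illustrative assumptions.
    """
    wq = quantize(w, io_bits, io_frac)   # quantized weights
    xq = quantize(x, io_bits, io_frac)   # quantized input features
    acc = 0.0
    for wi, xi in zip(wq, xq):
        prod = quantize(wi * xi, mul_bits, mul_frac)     # multiplier output
        acc = quantize(acc + prod, acc_bits, acc_frac)   # accumulator register
    return float(quantize(acc, io_bits, io_frac))        # quantized output feature

print(bit_accurate_dot([0.5, -0.25], [0.5, 0.5]))  # → 0.125
print(quantize(100.0, 8, 6))                       # saturates → 1.984375
```

Comparing the result of such a routine with an unquantized `np.dot` over a validation set is one way to estimate how much accuracy a given choice of multiplier and accumulator widths would cost, under the assumptions stated above.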