DANoC: An Efficient Algorithm and Hardware Codesign of Deep Neural Networks on Chip

IEEE Trans Neural Netw Learn Syst. 2018 Jul;29(7):3176-3187. doi: 10.1109/TNNLS.2017.2717442. Epub 2017 Jul 18.

Abstract

Deep neural networks (NNs) are the state-of-the-art models for understanding the content of images and videos. However, implementing deep NNs in embedded systems is a challenging task, e.g., a typical deep belief network could exhaust gigabytes of memory and result in bandwidth and computational bottlenecks. To address this challenge, this paper presents an algorithm and hardware codesign for efficient deep neural computation. A hardware-oriented deep learning algorithm, named the deep adaptive network, is proposed to explore the sparsity of neural connections. By adaptively removing the majority of neural connections and robustly representing the reserved connections using binary integers, the proposed algorithm could save up to 99.9% memory utility and computational resources without undermining classification accuracy. An efficient sparse-mapping-memory-based hardware architecture is proposed to fully take advantage of the algorithmic optimization. Different from traditional Von Neumann architecture, the deep-adaptive network on chip (DANoC) brings communication and computation in close proximity to avoid power-hungry parameter transfers between on-board memory and on-chip computational units. Experiments over different image classification benchmarks show that the DANoC system achieves competitively high accuracy and efficiency comparing with the state-of-the-art approaches.

Publication types

  • Research Support, Non-U.S. Gov't