REDRESS: Generating Compressed Models for Edge Inference Using Tsetlin Machines

Sidharth Maheshwari; Tousif Rahman; Rishad Shafik; Alex Yakovlev; Ashur Rafiev; Lei Jiao; Ole-Christoffer Granmo

doi:10.1109/TPAMI.2023.3268415

IEEE Trans Pattern Anal Mach Intell. 2023 Sep;45(9):11152-11168. doi: 10.1109/TPAMI.2023.3268415. Epub 2023 Aug 7.

PMID: 37074898

Abstract

Inference at the edge using embedded machine learning models involves challenging trade-offs between resource metrics, such as energy and memory footprint, and performance metrics, such as computation time and accuracy. In this work, we go beyond conventional Neural Network based approaches to explore the Tsetlin Machine (TM), an emerging machine learning algorithm that uses learning automata to create propositional logic for classification. We use algorithm-hardware co-design to propose a novel methodology for training and inference of TM. The methodology, called REDRESS, comprises independent TM training and inference techniques that reduce the memory footprint of the resulting automata to target low and ultra-low power applications. The array of Tsetlin Automata (TA) holds learned information in binary form as bits {0,1}, called excludes and includes, respectively. REDRESS proposes a lossless TA compression method, called include-encoding, that stores only the information associated with includes to achieve over 99% compression. This is enabled by a novel, computationally minimal training procedure, called Tsetlin Automata Re-profiling, which improves accuracy and increases the sparsity of the TA, reducing the number of includes and hence the memory footprint. Finally, REDRESS includes an inherently bit-parallel inference algorithm that operates on the optimally trained TA in the compressed domain, without requiring decompression at runtime, to obtain high speedups compared with state-of-the-art Binary Neural Network (BNN) models. We demonstrate that, using the REDRESS approach, TM outperforms BNN models on all design metrics for five benchmark datasets: MNIST, CIFAR2, KWS6, Fashion-MNIST and Kuzushiji-MNIST. When implemented on an STM32F746G-DISCO microcontroller, REDRESS obtained speedups and energy savings ranging from 5× to 5700× compared with different BNN models.
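
The idea of storing only includes and evaluating clauses without decompression can be illustrated with a minimal sketch. This is not the REDRESS encoding or its bit-parallel algorithm, whose details the abstract does not give. It assumes a toy storage format (a list of include indices per clause), a literal layout of features followed by their negations, and the common TM inference convention that an empty clause outputs 0. All function names are hypothetical.

```python
from typing import List, Sequence


def make_literals(features: Sequence[int]) -> List[int]:
    """Assumed literal layout: the features followed by their negations."""
    return list(features) + [1 - f for f in features]


def encode_includes(ta_actions: Sequence[Sequence[int]]) -> List[List[int]]:
    """Keep only the positions of include bits (1) for each clause.

    Index lists are an illustrative format, not the paper's include-encoding.
    """
    return [[i for i, bit in enumerate(row) if bit == 1] for row in ta_actions]


def decode_includes(encoded: Sequence[Sequence[int]], n_literals: int) -> List[List[int]]:
    """Rebuild the dense exclude/include rows, showing the encoding is lossless."""
    rows = []
    for idxs in encoded:
        row = [0] * n_literals
        for i in idxs:
            row[i] = 1
        rows.append(row)
    return rows


def clause_dense(row: Sequence[int], lits: Sequence[int]) -> int:
    """Conjunction of the included literals, evaluated on the dense row."""
    if not any(row):
        return 0  # assumption: an empty clause outputs 0 at inference
    return int(all(lit == 1 for bit, lit in zip(row, lits) if bit == 1))


def clause_encoded(idxs: Sequence[int], lits: Sequence[int]) -> int:
    """Same conjunction, evaluated directly on the include indices."""
    if not idxs:
        return 0  # same empty-clause assumption as clause_dense
    return int(all(lits[i] == 1 for i in idxs))


# Toy TA action array: 3 clauses over 4 literals [x0, x1, NOT x0, NOT x1].
ta_actions = [
    [1, 0, 0, 1],  # x0 AND NOT x1
    [0, 1, 1, 0],  # x1 AND NOT x0
    [0, 0, 0, 0],  # empty clause (all excludes)
]

encoded = encode_includes(ta_actions)
lits = make_literals([1, 0])  # input x0 = 1, x1 = 0
dense_out = [clause_dense(row, lits) for row in ta_actions]
encoded_out = [clause_encoded(idxs, lits) for idxs in encoded]
```

In this sketch the cost of evaluating a clause grows with its number of includes rather than with the total number of literals, which is why a sparser TA array helps both memory footprint and inference time. The toy example is far too small to show the compression ratios reported in the abstract.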