A Comparison of Low-Complexity Real-Time Feature Extraction for Neuromorphic Speech Recognition

Jyotibdha Acharya; Aakash Patil; Xiaoya Li; Yi Chen; Shih-Chii Liu; Arindam Basu

doi:10.3389/fnins.2018.00160

Front Neurosci. 2018 Mar 28:12:160. doi: 10.3389/fnins.2018.00160. eCollection 2018.

Authors

Jyotibdha Acharya¹, Aakash Patil², Xiaoya Li³, Yi Chen², Shih-Chii Liu³, Arindam Basu²

Affiliations

¹ HealthTech NTU, Interdisciplinary Graduate School, Nanyang Technological University, Singapore, Singapore.
² School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore.
³ Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland.

Abstract

This paper presents a real-time, low-complexity neuromorphic speech recognition system using a spiking silicon cochlea, a feature extraction module and a population-encoding-based Neural Engineering Framework (NEF)/Extreme Learning Machine (ELM) classifier IC. Several feature extraction methods with varying memory and computational complexity are presented along with their corresponding classification accuracies. On the N-TIDIGITS18 dataset, we show that a fixed-bin-size-based feature extraction method that votes across both time and spike count features can achieve an accuracy of 95% in software, similar to previously reported methods that use a fixed number of bins per sample, while using ~3× less energy and ~25× less memory for feature extraction (~1.5× less overall). Hardware measurements for the same topology show a slightly reduced accuracy of 94%, which can be attributed to the extra correlations in hardware random weights. The hardware accuracy can be increased by further increasing the number of hidden nodes in the ELM, at the cost of memory and energy.
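The two binning strategies contrasted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `(time, channel)` spike representation, and the 50 ms window in the usage lines are assumptions made for illustration, and the voting step and the NEF/ELM classifier are omitted.

```python
import numpy as np

def fixed_bin_size_counts(spike_times, spike_channels, n_channels, bin_size):
    """Spike counts per channel in consecutive windows of fixed duration.

    Hypothetical helper: the number of bins grows with the sample length.
    """
    t = np.asarray(spike_times, dtype=float)
    ch = np.asarray(spike_channels, dtype=int)
    if t.size == 0:
        return np.zeros((0, n_channels), dtype=int)
    # Window index of each spike, measured from the first spike.
    bin_idx = np.floor((t - t.min()) / bin_size).astype(int)
    counts = np.zeros((bin_idx.max() + 1, n_channels), dtype=int)
    np.add.at(counts, (bin_idx, ch), 1)  # accumulate repeated indices
    return counts

def fixed_bin_number_counts(spike_times, spike_channels, n_channels, n_bins):
    """Spike counts per channel with the sample split into n_bins equal parts.

    Hypothetical helper: the bin width depends on the total sample duration.
    """
    t = np.asarray(spike_times, dtype=float)
    ch = np.asarray(spike_channels, dtype=int)
    counts = np.zeros((n_bins, n_channels), dtype=int)
    if t.size == 0:
        return counts
    duration = t.max() - t.min()
    if duration == 0:
        bin_idx = np.zeros(t.size, dtype=int)
    else:
        bin_idx = np.floor((t - t.min()) / duration * n_bins).astype(int)
        bin_idx = np.minimum(bin_idx, n_bins - 1)  # last spike goes in last bin
    np.add.at(counts, (bin_idx, ch), 1)
    return counts

# Toy spike train: five spikes on two cochlea channels (made-up values).
times = [0.00, 0.01, 0.02, 0.06, 0.11]
channels = [0, 1, 0, 1, 0]
by_size = fixed_bin_size_counts(times, channels, n_channels=2, bin_size=0.05)
by_number = fixed_bin_number_counts(times, channels, n_channels=2, n_bins=2)
```

In this sketch the fixed-number variant needs the total duration of the sample before it can assign any spike to a bin, whereas the fixed-size variant can assign each spike to a window from its timestamp alone.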

Keywords: extreme learning machine; neural engineering framework; neuromorphic; real-time; silicon cochlea.