Brain-like emergent auditory learning: A developmental method

Dongshu Wang; Hui Shan; Jianbin Xin

doi:10.1016/j.heares.2018.08.010

Hear Res. 2018 Dec:370:283-293. doi: 10.1016/j.heares.2018.08.010. Epub 2018 Aug 31.

Authors

Dongshu Wang¹, Hui Shan², Jianbin Xin³

Affiliations

¹ School of Electrical Engineering, Zhengzhou University, No.100, Science Road, Zhengzhou, 450001, PR China. Electronic address: wangdongshu@zzu.edu.cn.
² School of Electrical Engineering, Zhengzhou University, No.100, Science Road, Zhengzhou, 450001, PR China.
³ School of Electrical Engineering, Zhengzhou University, No.100, Science Road, Zhengzhou, 450001, PR China. Electronic address: j.xin@zzu.edu.cn.

PMID: 30193803

Abstract

Compared with machine audition, the human auditory system can recognize speech accurately and quickly. This paper proposes a new developmental network (DN) that simulates the human auditory system and constructs an artificial auditory model for speech recognition. The new model simulates each key element of the human auditory pathway as a deep network; in particular, an additional layer is added to the network to simulate the function of the superior colliculus in the thalamus for speech context integration. Mel-frequency cepstral coefficients (MFCCs) are extracted from the speech signal and serve as the sensory input of the DN. The emergent feature of the DN model explains how such internal neurons represent short speech context when they are not supervised by the external world. The experimental results show that the recognition rates for English words and phrases improve significantly on those reported in the existing literature. The proposed DN model offers a new approach to difficult problems in traditional machine audition systems, such as universal speech recognition. The same learning principle can potentially be used in, or adapted to, other computational contexts and applications.
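
The MFCC front end mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the 16 kHz sample rate, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients, the 0.97 pre-emphasis factor, and the function names (mfcc, mel_filterbank) are all assumptions chosen for illustration. Only NumPy is used, and the input signal is assumed to be non-empty.

```python
import numpy as np


def hz_to_mel(hz):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # All default values are illustrative assumptions, not taken from the paper.
    signal = np.asarray(signal, dtype=float)

    # Pre-emphasis boosts high frequencies before spectral analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    # Zero-pad very short signals so that at least one frame exists.
    if emphasized.size < frame_len:
        emphasized = np.pad(emphasized, (0, frame_len - emphasized.size))

    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (emphasized.size - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)

    # Power spectrum of each frame (frames are zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft

    # Mel filterbank energies, then log compression.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)

    # Unnormalized DCT-II over the filter axis; keep the first n_coeffs terms.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)
    basis = np.cos(np.pi * k[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energies @ basis.T
```

Calling mfcc on a one-dimensional array of audio samples returns one row of coefficients per frame. In a setup like the one the abstract describes, each row (or a short window of rows) would be presented to the network as its sensory input; how the paper actually frames and feeds the features is not stated in the abstract.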

Keywords: Auditory system; Developmental network; Emergent representation; MFCC; Speech recognition.

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Auditory Pathways / physiology*
Brain / physiology*
Computer Simulation*
Humans
Models, Neurological*
Recognition, Psychology*
Speech Perception*