Brain-like emergent auditory learning: A developmental method

Hear Res. 2018 Dec:370:283-293. doi: 10.1016/j.heares.2018.08.010. Epub 2018 Aug 31.

Abstract

Compared with machine audition, the human auditory system can recognize speech accurately and quickly. This paper proposes a new developmental network (DN) that simulates the human auditory system and constructs an artificial auditory model for speech recognition. The new model simulates each key element of the human auditory pathway as a deep network; in particular, an additional layer in the network is considered to simulate the function of the superior colliculus in the thalamus for speech context integration. Mel-frequency cepstral coefficients (MFCCs) are extracted from the speech signal and serve as the sensory input of the DN. The emergent feature of the DN model provides an explanation of how such internal neurons represent the short speech context when they are not supervised by the external world. The experimental results show that the recognition rates of English words and phrases can be improved significantly compared to those reported in the existing literature. The proposed DN model offers a new approach to problems that are difficult for traditional machine audition systems, such as universal speech recognition. Meanwhile, the same learning principle can potentially be used in or adapted to other computational contexts and applications.
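
The MFCC front end mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the analysis settings, so the 16 kHz sample rate, 25 ms frames, 10 ms hop, 26 mel filters and 13 coefficients below are common defaults assumed purely for illustration, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # Widely used mel-scale formula (O'Shaughnessy form).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # All default parameter values are assumptions, not taken from the paper.
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # 1. Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 3. Mel filterbank energies, floored to avoid log(0), then log-compressed.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))
    # 4. Unnormalized DCT-II across filters; keep the first n_coeffs coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None]
                   * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energies @ basis.T  # shape: (n_frames, n_coeffs)
```

Each row of the returned array is one frame's coefficient vector; in a pipeline like the one described, such per-frame vectors would be fed to the network as its sensory input.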

Keywords: Auditory system; Developmental network; Emergent representation; MFCC; Speech recognition.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Auditory Pathways / physiology*
  • Brain / physiology*
  • Computer Simulation*
  • Humans
  • Models, Neurological*
  • Recognition, Psychology*
  • Speech Perception*