Learning to Produce Syllabic Speech Sounds via Reward-Modulated Neural Plasticity

PLoS One. 2016 Jan 25;11(1):e0145096. doi: 10.1371/journal.pone.0145096. eCollection 2016.

Abstract

At around 7 months of age, human infants begin to reliably produce well-formed syllables containing both consonants and vowels, a behavior called canonical babbling. Over subsequent months, the frequency of canonical babbling continues to increase. How the infant's nervous system supports the acquisition of this ability is unknown. Here we present a computational model that combines a spiking neural network, reinforcement-modulated spike-timing-dependent plasticity, and a human-like vocal tract to simulate the acquisition of canonical babbling. Like human infants, the model's frequency of canonical babbling gradually increases. The model is rewarded when it produces a sound that is more auditorily salient than sounds it has previously produced. This is consistent with data from human infants indicating that contingent adult responses shape infant behavior and with data from deaf and tracheostomized infants indicating that hearing, including hearing one's own vocalizations, is critical for canonical babbling development. Reward receipt increases the level of dopamine in the neural network. The neural network contains a reservoir with recurrent connections and two motor neuron groups, one agonist and one antagonist, which control the masseter and orbicularis oris muscles, promoting or inhibiting mouth closure. The model learns to increase the number of salient, syllabic sounds it produces by adjusting the muscles' base levels of activation and increasing their range of activity. Our results support the possibility that through dopamine-modulated spike-timing-dependent plasticity, the motor cortex learns to harness its natural oscillations in activity in order to produce syllabic sounds. The model thus suggests that learning to produce rhythmic mouth movements for speech production may be supported by general cortical learning mechanisms. The model makes several testable predictions and has implications not only for our understanding of how syllabic vocalizations develop in infancy but also for how syllabic vocalizations may have evolved.
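The core learning rule the abstract describes, dopamine-gated spike-timing-dependent plasticity, can be illustrated with a minimal single-synapse sketch: STDP spike pairings tag an eligibility trace, and the weight changes only when a later dopamine (reward) signal multiplies that trace. All parameter values and names below are illustrative assumptions, not values from the paper.

```python
import math

# Illustrative constants (assumed, not from the paper)
A_PLUS, A_MINUS = 1.0, 1.2   # STDP potentiation/depression amplitudes
TAU_STDP = 20.0              # STDP pairing time constant (ms)
TAU_ELIG = 1000.0            # eligibility-trace decay time constant (ms)
TAU_DA = 200.0               # dopamine decay time constant (ms)
LR = 0.002                   # learning rate

class Synapse:
    """One synapse with a weight and a slow eligibility trace."""
    def __init__(self, w=0.5):
        self.w = w    # synaptic weight
        self.c = 0.0  # eligibility trace set by STDP, read out by dopamine

    def on_spike_pair(self, dt):
        """dt = t_post - t_pre in ms; pre-before-post tags potentiation."""
        if dt > 0:
            self.c += A_PLUS * math.exp(-dt / TAU_STDP)
        else:
            self.c -= A_MINUS * math.exp(dt / TAU_STDP)

    def step(self, dopamine, dt_ms=1.0):
        # Weight moves in proportion to eligibility x dopamine, so a reward
        # arriving within the trace's lifetime reinforces recently
        # co-active synapses; with no dopamine, nothing is consolidated.
        self.w += LR * self.c * dopamine * dt_ms
        self.c *= math.exp(-dt_ms / TAU_ELIG)

syn = Synapse()
dopamine = 0.0
for t in range(500):  # 500 ms of simulated time, 1 ms steps
    if t == 10:
        syn.on_spike_pair(5.0)  # causal pre->post pairing tags the synapse
    if t == 60:
        dopamine += 1.0         # reward (a salient sound) releases dopamine
    syn.step(dopamine)
    dopamine *= math.exp(-1.0 / TAU_DA)

print(syn.w > 0.5)  # the rewarded pairing strengthened the synapse
```

Because the eligibility trace decays slowly relative to the STDP window, the delayed reward at 60 ms still credits the pairing made at 10 ms; this delayed-credit property is what lets a network learn from a reward computed only after a vocalization is heard.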

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Auditory Perception
  • Computer Simulation*
  • Dopamine / physiology
  • Feedback
  • Humans
  • Infant
  • Jaw / physiology
  • Language Development*
  • Laryngeal Muscles / physiology
  • Lip / physiology
  • Machine Learning*
  • Models, Neurological*
  • Motor Cortex / physiology
  • Motor Neurons / physiology
  • Nerve Net / physiology
  • Neural Networks, Computer*
  • Neuronal Plasticity*
  • Phonetics*
  • Reward

Substances

  • Dopamine

Grants and funding

MKF’s efforts were funded by the University of California, Merced Undergraduate Research in Computational Biology Program, sponsored by National Science Foundation Grant DBI-1040962 (http://www.nsf.gov/awardsearch/showAward?AWD_ID=1040962).