Actor-critic reinforcement learning in the songbird

Ruidong Chen; Jesse H Goldberg

doi:10.1016/j.conb.2020.08.005

Curr Opin Neurobiol. 2020 Dec:65:1-9. doi: 10.1016/j.conb.2020.08.005. Epub 2020 Sep 6.

Authors

Ruidong Chen¹, Jesse H Goldberg²

Affiliations

¹ Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, United States.
² Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, United States. Electronic address: jesse.goldberg@cornell.edu.

Abstract

It feels rewarding to ace your opponent on match point. Here, we propose that common mechanisms underlie reward and performance learning. First, when a singing bird unexpectedly hits the right note, its dopamine (DA) neurons are activated, much as when a thirsty monkey receives an unexpected juice reward. Second, these DA signals reinforce vocal variations much as they reinforce stimulus-response associations. Third, limbic inputs to DA neurons signal the predicted quality of song syllables much like they signal the predicted reward value of a place or a stimulus during foraging. Finally, songbirds may solve difficult problems in reinforcement learning - such as credit assignment and catastrophic forgetting - with node perturbation and consolidation of reinforced vocal patterns in motor cortical circuits. Consolidation occurs downstream of a canonical 'actor-critic' circuit motif that learns to maximize performance quality in essentially the same way it learns to maximize reward: by computing and learning from prediction errors.
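
The actor-critic idea in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a single scalar "pitch" parameter, a quadratic performance-quality function, Gaussian exploratory noise standing in for node perturbation, and hand-picked learning rates. All names (simulate_song_learning, pitch_mean, predicted_quality, and so on) are hypothetical. The critic tracks predicted quality, the difference between actual and predicted quality plays the role of the DA-like prediction error, and the actor shifts its mean output toward perturbations that were followed by a positive error.

```python
import random


def simulate_song_learning(target_pitch=1.0, trials=5000, noise_sd=0.1,
                           actor_lr=1.0, critic_lr=0.1, seed=0):
    """Toy actor-critic with node perturbation (illustrative assumptions only)."""
    rng = random.Random(seed)
    pitch_mean = 0.0         # actor: mean motor output for one syllable
    predicted_quality = 0.0  # critic: expected performance quality

    for _ in range(trials):
        # Node perturbation: inject exploratory noise into the motor output.
        perturbation = rng.gauss(0.0, noise_sd)
        pitch = pitch_mean + perturbation

        # Assumed quality function: best (zero) when pitch hits the target.
        quality = -(pitch - target_pitch) ** 2

        # Prediction error: actual minus predicted quality.
        prediction_error = quality - predicted_quality

        # Actor: reinforce perturbations that did better than predicted.
        pitch_mean += actor_lr * prediction_error * perturbation

        # Critic: move the prediction toward the observed quality.
        predicted_quality += critic_lr * prediction_error

    return pitch_mean, predicted_quality


if __name__ == "__main__":
    learned_pitch, learned_prediction = simulate_song_learning()
    print(f"learned pitch mean: {learned_pitch:.3f}")
    print(f"critic's predicted quality: {learned_prediction:.3f}")
```

Under these assumptions the learned pitch mean drifts toward the target even though the learner only ever observes a scalar evaluation of each rendition. Because the perturbation is known, credit is assigned by correlating it with the prediction error rather than by computing an explicit gradient.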

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Review

MeSH terms

Animals
Dopaminergic Neurons
Motor Cortex*
Reinforcement, Psychology
Reward
Songbirds*
Vocalization, Animal

Grants and funding

R01 NS094667/NS/NINDS NIH HHS/United States