How do organisms learn to repeat, on demand, a behavior that led to a desirable outcome? Dopamine-dependent cortico-striatal plasticity provides a framework for learning the value of a behavior, but it is less clear how it enables the brain to re-enter desired behaviors and refine them over time. Reinforcing a behavior is achieved by re-entering and refining the neural patterns that produce it. We review studies using brain-machine interfaces, which reveal that reinforcing cortical population activity requires cortico-basal ganglia circuits. We then propose a formal framework for how reinforcement in cortico-basal ganglia circuits acts on the neural dynamics of cortical populations. We propose two parallel mechanisms: i) fast reinforcement, which selects the inputs that permit re-entry into the particular cortical population dynamics that naturally produced the desired behavior, and ii) slower reinforcement, which refines cortical population dynamics and leads to more reliable production of the neural trajectories that drive skillful behavior on demand.
Copyright © 2019. Published by Elsevier Ltd.