Recurrent neural networks that learn multi-step visual routines with reinforcement learning

PLoS Comput Biol. 2024 Apr 29;20(4):e1012030. doi: 10.1371/journal.pcbi.1012030. eCollection 2024 Apr.

Abstract

Many cognitive problems can be decomposed into a series of subproblems that are solved sequentially by the brain. When a subproblem is solved, relevant intermediate results need to be stored by neurons and propagated to the next subproblem, until the overarching goal has been reached. Here we consider visual tasks, which can be decomposed into sequences of elemental visual operations. Experimental evidence suggests that intermediate results of these elemental operations are stored in working memory as an enhancement of neural activity in the visual cortex. The focus of enhanced activity is then available for subsequent operations to act upon. The central question is how the elemental operations and their sequencing can emerge in neural networks that are trained with only rewards, in a reinforcement learning setting. Here we propose a new recurrent neural network architecture that can learn composite visual tasks requiring the application of successive elemental operations. Specifically, we selected three tasks for which electrophysiological recordings from the visual cortex of monkeys are available. To train the networks, we used RELEARNN, a biologically plausible four-factor Hebbian learning rule that is local in both time and space. We report that networks learn elemental operations, such as contour grouping and visual search, and execute sequences of operations, based solely on the characteristics of the visual stimuli and the reward structure of the task. After training was completed, the activity of the network units elicited by behaviorally relevant image items was stronger than that elicited by irrelevant ones, just as has been observed in the visual cortex of monkeys solving the same tasks. Relevant information that needed to be exchanged between subroutines was maintained as a focus of enhanced activity and passed on to subsequent subroutines. Our results demonstrate how a biologically plausible learning rule can train a recurrent neural network on multi-step visual tasks.
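The abstract gives no implementation details, so purely as a rough illustration, the sketch below shows what a reward-modulated, four-factor-style Hebbian update for a small recurrent network could look like: the weight change combines presynaptic activity, postsynaptic activity, a per-unit credit (or "attention") factor, and a global reward-prediction error. All names, network sizes, and the credit-assignment shortcut (reusing the output weights of the chosen action as a stand-in for a feedback-propagated signal) are assumptions for this sketch and are not taken from the paper or from RELEARNN itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative network sizes, not the paper's
n_in, n_hid, n_out = 20, 50, 3
W_in  = rng.normal(0, 0.1, (n_hid, n_in))
W_rec = rng.normal(0, 0.1, (n_hid, n_hid))
W_out = rng.normal(0, 0.1, (n_out, n_hid))
lr = 0.01

def run_trial(x, n_steps=10, epsilon=0.05):
    """Iterate the recurrent dynamics toward a steady state, then pick an action."""
    h = np.zeros(n_hid)
    for _ in range(n_steps):
        h = np.tanh(W_in @ x + W_rec @ h)
    q = W_out @ h                                   # action values
    if rng.random() < epsilon:                      # occasional exploration
        action = int(rng.integers(n_out))
    else:
        action = int(np.argmax(q))
    return h, q, action

def four_factor_update(x, h, action, reward, value_estimate):
    """Schematic four-factor update: pre * post * credit * reward-prediction error."""
    global W_in, W_rec, W_out
    delta = reward - value_estimate                 # global neuromodulator-like error signal
    # Credit factor per hidden unit: here crudely approximated by the output
    # weights of the chosen action (a placeholder for a feedback-propagated signal).
    credit = W_out[action]
    W_out[action] += lr * delta * h
    W_rec += lr * delta * np.outer(credit * h, h)   # pre (h), post (h), credit, delta
    W_in  += lr * delta * np.outer(credit * h, x)   # pre (x), post (h), credit, delta
    return delta

# Hypothetical usage on one trial with a stand-in stimulus encoding
x = rng.normal(size=n_in)
h, q, action = run_trial(x)
four_factor_update(x, h, action, reward=1.0, value_estimate=q[action])
```

Note that all four factors are available locally at each synapse or broadcast globally, which is what makes rules of this family biologically plausible; the sketch only conveys that structure, not the authors' actual architecture or training procedure.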

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computational Biology
  • Learning / physiology
  • Macaca mulatta
  • Memory, Short-Term / physiology
  • Models, Neurological*
  • Neural Networks, Computer*
  • Neurons / physiology
  • Reinforcement, Psychology*
  • Visual Cortex* / physiology
  • Visual Perception / physiology

Grants and funding

This research has received funding from the European Union's Horizon 2020 Framework Programme for Research and Innovation under Specific Grant Agreement No. 945539 (Human Brain Project SGA3, Task 3.7, P.R.R., S.M.B.), Horizon Europe (ERC Advanced Grant 101052963 "NUMEROUS", P.R.R.), NWO (Crossover grant 17619 “INTENSE” and NWO-OCENW.KLEIN.178, S.M.B.), “DBI2”, a Gravitation program of the Dutch Ministry of Science (S.M.B.), and the Agence Nationale de la Recherche (ANR) within the Programme d’Investissements d’Avenir, Institut Hospitalo-Universitaire FOReSIGHT (ANR-18-IAHU-0001, P.R.R.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.