Neuro-Inspired Reinforcement Learning to Improve Trajectory Prediction in Reward-Guided Behavior

Bo-Wei Chen; Shih-Hung Yang; Chao-Hung Kuo; Jia-Wei Chen; Yu-Chun Lo; Yun-Ting Kuo; Yi-Chen Lin; Hao-Cheng Chang; Sheng-Huang Lin; Xiao Yu; Boyi Qu; Shuan-Chu Vina Ro; Hsin-Yi Lai; You-Yin Chen

doi:10.1142/S0129065722500381

Int J Neural Syst. 2022 Sep;32(9):2250038. doi: 10.1142/S0129065722500381. Epub 2022 Aug 19.

Authors

Bo-Wei Chen¹; Shih-Hung Yang²; Chao-Hung Kuo¹,³,⁴; Jia-Wei Chen¹; Yu-Chun Lo⁵; Yun-Ting Kuo¹; Yi-Chen Lin¹; Hao-Cheng Chang¹; Sheng-Huang Lin¹,⁶,⁷; Xiao Yu⁸,⁹; Boyi Qu⁸,⁹; Shuan-Chu Vina Ro¹⁰; Hsin-Yi Lai⁸,⁹,¹¹; You-Yin Chen¹,⁵

Affiliations

¹ Department of Biomedical Engineering, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 11221, Taiwan.
² Department of Mechanical Engineering, National Cheng Kung University, No. 1, University Rd., Tainan 70101, Taiwan.
³ Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, No. 201, Sec. 2, Shipai Rd., Taipei 11217, Taiwan.
⁴ Department of Neurological Surgery, University of Washington, 1959 NE Pacific St., Seattle, WA 98195-6470, U.S.A.
⁵ The Ph.D. Program for Neural Regenerative Medicine, College of Medical Science and Technology, Taipei Medical University, No. 250, Wu-Xing St., Taipei 11031, Taiwan.
⁶ Department of Neurology, Hualien Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, No. 707, Sec. 3, Zhongyang Rd., Hualien 97002, Taiwan.
⁷ Department of Neurology, School of Medicine, Tzu Chi University, No. 701, Sec. 3, Zhongyang Rd., Hualien 97004, Taiwan.
⁸ Department of Neurology of the Second Affiliated Hospital, Interdisciplinary Institute of Neuroscience and Technology, Key Laboratory of Medical Neurobiology of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou 310029, P. R. China.
⁹ College of Biomedical Engineering and Instrument Science, Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou 310027, P. R. China.
¹⁰ Department of Biomedical Engineering, Johns Hopkins School of Medicine, 720 Rutland Ave., Baltimore, MD 21205, U.S.A.
¹¹ MOE Frontier Science Center for Brain Science and Brain-Machine Integration, School of Brain Science and Brain Medicine, Zhejiang University, Hangzhou 310012, China.

PMID: 35989578
DOI: 10.1142/S0129065722500381

Abstract

Hippocampal pyramidal cells and interneurons play a key role in spatial navigation. In goal-directed behavior associated with rewards, the spatial firing pattern of pyramidal cells is modulated by the animal's moving direction toward a reward, with a dependence on auditory, olfactory, and somatosensory stimuli for head orientation. Additionally, interneurons in the CA1 region of the hippocampus that are monosynaptically connected to CA1 pyramidal cells are modulated by a complex set of interacting brain regions related to reward and recall. The computational method of reinforcement learning (RL) has been widely used to investigate spatial navigation, which in turn has been increasingly used to study rodent learning associated with reward. The rewards in RL are used for discovering a desired behavior through the integration of two streams of neural activity: trial-and-error interactions with the external environment to achieve a goal, and the intrinsic motivation primarily driven by the brain's reward system to accelerate learning. Recognizing the potential benefit of the neural representation of this reward design for novel RL architectures, we propose an RL algorithm based on Q-learning with a perspective on biomimetics (neuro-inspired RL) to decode rodent movement trajectories. The reward function, inspired by the neuronal information processing uncovered in the hippocampus, combines the preferred direction of pyramidal cell firing as the extrinsic reward signal with the coupling between pyramidal cell-interneuron pairs as the intrinsic reward signal. Our experimental results demonstrate that the neuro-inspired RL, with a combined use of extrinsic and intrinsic rewards, outperforms other spatial decoding algorithms, including RL methods that use a single reward function. The new RL algorithm could help accelerate learning convergence rates and improve the prediction accuracy for moving trajectories.
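
As a rough illustration of the reward design described in the abstract, the sketch below feeds a combined extrinsic and intrinsic reward into one step of standard tabular Q-learning (assumed here to be the learning rule the abstract refers to). The abstract does not say how the two signals are combined, so the weighted sum, the weights `w_ext` and `w_int`, the four-direction action set, the grid-cell states, and all function names are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical discrete movement directions (an assumption, not from the paper).
ACTIONS = ("up", "down", "left", "right")


def combined_reward(r_extrinsic, r_intrinsic, w_ext=1.0, w_int=0.5):
    """Weighted sum of the two reward streams.

    r_extrinsic stands in for a signal derived from pyramidal cell
    preferred direction; r_intrinsic stands in for a signal derived from
    pyramidal cell-interneuron coupling. The weights are illustrative.
    """
    return w_ext * r_extrinsic + w_int * r_intrinsic


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One standard tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Usage: a single update on an empty Q-table with made-up reward values.
q_table = defaultdict(float)
r = combined_reward(r_extrinsic=1.0, r_intrinsic=0.4)
new_value = q_update(q_table, state=(0, 0), action="right",
                     reward=r, next_state=(0, 1))
```

Setting `w_int=0.0` in this sketch reduces it to a single-reward learner, which is the kind of baseline the abstract compares against.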

Keywords: Hippocampus; goal-directed behaviors; neural decoding; neuro-inspired reinforcement learning; reward function; spatial navigation.

MeSH terms

Animals
Learning / physiology
Neurons / physiology
Reinforcement, Psychology
Reward*
Spatial Navigation*