Dynamic sparse coding-based value estimation network for deep reinforcement learning

Neural Netw. 2023 Nov:168:180-193. doi: 10.1016/j.neunet.2023.09.013. Epub 2023 Sep 11.

Abstract

Deep Reinforcement Learning (DRL) is a powerful tool for a wide range of control and automation problems. The performance of DRL depends heavily on the accuracy of value estimation for environment states. However, the Value Estimation Network (VEN) in DRL is easily affected by catastrophic interference arising from both the environment and the training process. In this paper, we propose a Dynamic Sparse Coding-based (DSC) VEN model that obtains precise sparse representations for accurate value prediction and sparse parameters for efficient training, and that is applicable not only to Q-learning-structured discrete-action DRL but also to actor-critic-structured continuous-action DRL. Specifically, to alleviate interference in the VEN, we employ DSC to learn sparse representations for accurate value estimation using dynamic gradients, in contrast to the conventional ℓ1 norm, which provides the same gradient magnitude for all weights. To avoid the influence of redundant parameters, we employ DSC to prune weights with dynamic thresholds, which is more efficient than static thresholds such as those induced by the ℓ1 norm. Experiments demonstrate that the proposed algorithms with dynamic sparse coding achieve higher control performance than existing benchmark DRL algorithms in both discrete-action and continuous-action environments, e.g., an improvement of over 25% in Puddle World and about 10% in Hopper. Moreover, the proposed algorithm converges efficiently, requiring fewer episodes across different environments.
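The abstract does not give the exact DSC update rule; as a rough illustration of the contrast it draws between static and dynamic thresholding, the sketch below compares standard ℓ1 soft-thresholding (the same shrinkage applied to every weight) with a hypothetical dynamic threshold scaled by each layer's current weight statistics. The function names and the specific dynamic rule are assumptions for illustration only, not the authors' implementation.

```python
import torch


def l1_soft_threshold(weights: torch.Tensor, lam: float) -> torch.Tensor:
    """Static shrinkage: the proximal operator of the l1 norm applies
    the same threshold `lam` to every weight, regardless of scale."""
    return torch.sign(weights) * torch.clamp(weights.abs() - lam, min=0.0)


def dynamic_soft_threshold(weights: torch.Tensor, lam: float) -> torch.Tensor:
    """Hypothetical dynamic shrinkage: scale the threshold by the layer's
    mean absolute weight, so pruning pressure adapts as the weight
    distribution changes during training (an assumption, not the paper's
    exact DSC rule)."""
    threshold = lam * weights.abs().mean()
    return torch.sign(weights) * torch.clamp(weights.abs() - threshold, min=0.0)


if __name__ == "__main__":
    torch.manual_seed(0)
    w = torch.randn(4, 8) * 0.1  # e.g., one layer of a value estimation network
    static = l1_soft_threshold(w, lam=0.05)
    dynamic = dynamic_soft_threshold(w, lam=0.5)
    print("static sparsity: ", (static == 0).float().mean().item())
    print("dynamic sparsity:", (dynamic == 0).float().mean().item())
```

Under this toy setup, the dynamic threshold loosens or tightens automatically as the weight magnitudes shrink over training, whereas the static ℓ1 threshold keeps pruning at a fixed level.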

Keywords: Deep reinforcement learning; Dynamic sparse coding; Value estimation network.

MeSH terms

  • Algorithms
  • Automation
  • Benchmarking
  • Learning*
  • Reinforcement, Psychology*