A Reinforcement Learning Approach to Understanding Procrastination: Does Inaccurate Value Approximation Cause Irrational Postponing of a Task?

Zheyu Feng; Asako Mitsuto Nagase; Kenji Morita

doi:10.3389/fnins.2021.660595

Front Neurosci. 2021 Sep 16:15:660595. doi: 10.3389/fnins.2021.660595. eCollection 2021.

Authors

Zheyu Feng¹, Asako Mitsuto Nagase¹,²,³,⁴, Kenji Morita¹,⁵

Affiliations

¹ Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan.
² Division of Neurology, Department of Brain and Neurosciences, Faculty of Medicine, Tottori University, Yonago, Japan.
³ Research Fellowship for Young Scientists, Japan Society for the Promotion of Science, Tokyo, Japan.
⁴ Department of Neurology, Faculty of Medicine, Shimane University, Izumo, Japan.
⁵ International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan.

Abstract

Procrastination is the voluntary but irrational postponing of a task despite being aware that the delay can lead to worse consequences. It has been extensively studied in the field of psychology, from contributing factors to theoretical models. From the perspective of value-based decision making and reinforcement learning (RL), procrastination has been suggested to be caused by non-optimal choice resulting from cognitive limitations. Exactly what sort of cognitive limitations are involved, however, remains elusive. In the current study, we examined whether a particular type of cognitive limitation, namely inaccurate valuation resulting from inadequate state representation, would cause procrastination. Recent work has suggested that humans may adopt a particular type of state representation called the successor representation (SR) and that humans can learn to represent states by relatively low-dimensional features. Combining these suggestions, we assumed a dimension-reduced version of SR. We modeled a series of behaviors of a "student" doing assignments during the school term, when putting off the assignments (i.e., procrastination) is not allowed, and during the vacation, when the "student" can freely choose whether to procrastinate. We assumed that the "student" had acquired a rigid reduced SR of each state, corresponding to each step in completing an assignment, under the policy without procrastination. Through temporal-difference (TD) learning, the "student" learned the approximated value of each state, which was computed as a linear function of the features of the state in the rigid reduced SR. During the vacation, the "student" decided at each time-step whether to procrastinate based on these approximated values. Simulation results showed that the reduced SR-based RL model generated procrastination behavior, which worsened across episodes. According to the values approximated by the "student," procrastinating was the better choice, whereas not procrastinating was mostly better according to the true values. Thus, the current model generated procrastination behavior caused by inaccurate value approximation, which resulted from the adoption of the reduced SR as state representation. These findings indicate that the reduced SR, or more generally, dimension reduction in state representation, can be a potential form of cognitive limitation that leads to procrastination.
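
The mechanism described in the abstract (a value function that is linear in reduced SR features and learned by TD) can be illustrated with a minimal sketch. This is not the authors' model. The five-state chain, the reward values, the learning rate, and the choice to keep only the goal-state column of the SR as the single feature are all assumptions made here for illustration, and the function names `successor_representation` and `td_learn_weights` are hypothetical. The sketch covers only value approximation under the no-procrastination policy, not the choice to procrastinate.

```python
import numpy as np


def successor_representation(n_states, gamma):
    """SR of a chain 0 -> 1 -> ... -> n-1 under a 'never procrastinate' policy.

    M[s, j] is the discounted number of future visits to state j when
    starting from state s. The last state is terminal (assumption).
    """
    P = np.zeros((n_states, n_states))
    for s in range(n_states - 1):
        P[s, s + 1] = 1.0  # deterministic step to the next state
    return np.linalg.inv(np.eye(n_states) - gamma * P)


def td_learn_weights(features, rewards, gamma, alpha=0.1, n_episodes=2000):
    """TD(0) learning of weights w so that V(s) is approximated by features[s] @ w."""
    n_states, n_features = features.shape
    w = np.zeros(n_features)
    for _ in range(n_episodes):
        for s in range(n_states):
            # Value of the successor state; zero after the terminal state.
            v_next = features[s + 1] @ w if s + 1 < n_states else 0.0
            delta = rewards[s] + gamma * v_next - features[s] @ w  # TD error
            w += alpha * delta * features[s]
    return w


if __name__ == "__main__":
    gamma = 0.9
    # Illustrative rewards (assumption): a cost per step, a payoff at the end.
    rewards = np.array([-1.0, -1.0, -1.0, -1.0, 10.0])
    M = successor_representation(len(rewards), gamma)
    true_values = M @ rewards      # exact values under this policy
    reduced = M[:, [-1]]           # reduced SR: goal-state column only (assumption)
    w = td_learn_weights(reduced, rewards, gamma)
    print("true values:        ", np.round(true_values, 2))
    print("approximated values:", np.round(reduced @ w, 2))
```

With the full SR as features, the same TD rule recovers the true values in this toy chain. With the single reduced feature, the approximated values deviate from them. That mismatch is the kind of inaccurate valuation the abstract identifies as a possible source of procrastination.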

Keywords: dimension reduction; procrastination; reinforcement learning; state representation; successor representation; temporal difference learning; value-based decision making.