Gaze data reveal distinct choice processes underlying model-based and model-free reinforcement learning

Arkady Konovalov; Ian Krajbich

doi:10.1038/ncomms12438

Nat Commun. 2016 Aug 11;7:12438. doi: 10.1038/ncomms12438.

Authors

Arkady Konovalov¹, Ian Krajbich¹,²

Affiliations

¹ Department of Economics, The Ohio State University, 1945 North High Street, 410 Arps Hall, Columbus, Ohio 43210, USA.
² Department of Psychology, The Ohio State University, 1827 Neil Avenue, 200E Lazenby Hall, Columbus, Ohio 43210, USA.

Abstract

Organisms appear to learn and make decisions using different strategies known as model-free and model-based learning; the former is mere reinforcement of previously rewarded actions and the latter is a forward-looking strategy that involves evaluation of action-state transition probabilities. Prior work has used neural data to argue that both model-based and model-free learners implement a value comparison process at trial onset, but model-based learners assign more weight to forward-looking computations. Here using eye-tracking, we report evidence for a different interpretation of prior results: model-based subjects make their choices prior to trial onset. In contrast, model-free subjects tend to ignore model-based aspects of the task and instead seem to treat the decision problem as a simple comparison process between two differentially valued items, consistent with previous work on sequential-sampling models of decision making. These findings illustrate a problem with assuming that experimental subjects make their decisions at the same prescribed time.

MeSH terms

Choice Behavior
Decision Making
Eye Movements*
Female
Fixation, Ocular*
Humans
Learning*
Least-Squares Analysis
Male
Models, Neurological
Models, Psychological
Nerve Net
Probability
Regression Analysis
Reinforcement, Psychology*
Reward