Exploration and recency as the main proximate causes of probability matching: a reinforcement learning analysis

Carolina Feher da Silva; Camila Gomes Victorino; Nestor Caticha; Marcus Vinícius Chrysóstomo Baldo

doi:10.1038/s41598-017-15587-z

Sci Rep. 2017 Nov 10;7(1):15326. doi: 10.1038/s41598-017-15587-z.

Authors

Carolina Feher da Silva¹, Camila Gomes Victorino², Nestor Caticha³, Marcus Vinícius Chrysóstomo Baldo⁴

Affiliations

¹ Department of General Physics, Institute of Physics, University of São Paulo, Rua do Matão Nr. 1371, Cidade Universitária, CEP 05508-090, São Paulo, SP, Brazil. carolina.feher.silva@usp.br.
² Department of Physiology and Biophysics, Institute of Biomedical Sciences, University of São Paulo, Av. Prof. Lineu Prestes, 1524, ICB-I, Cidade Universitária, CEP 05508-000, São Paulo, SP, Brazil. camila.victorino@usp.br.
³ Department of General Physics, Institute of Physics, University of São Paulo, Rua do Matão Nr. 1371, Cidade Universitária, CEP 05508-090, São Paulo, SP, Brazil.
⁴ Department of Physiology and Biophysics, Institute of Biomedical Sciences, University of São Paulo, Av. Prof. Lineu Prestes, 1524, ICB-I, Cidade Universitária, CEP 05508-000, São Paulo, SP, Brazil.

Abstract

Research has not yet reached a consensus on why humans match probabilities instead of maximising in a probability learning task. The most influential explanation is that they search for patterns in the random sequence of outcomes. Other explanations, such as expectation matching, are plausible, but do not consider how reinforcement learning shapes people's choices. We aimed to quantify how human performance in a probability learning task is affected by pattern search and reinforcement learning. We collected behavioural data from 84 young adult participants who performed a probability learning task wherein the majority outcome was rewarded with probability 0.7, and analysed the data using a reinforcement learning model that searches for patterns. Model simulations indicated that pattern search, exploration, recency (discounting early experiences), and forgetting may impair performance. Our analysis estimated that 85% (95% HDI [76, 94]) of participants searched for patterns and believed that each trial outcome depended on one or two previous ones. The estimated impact of pattern search on performance was, however, only 6%, while those of exploration and recency were 19% and 13% respectively. This suggests that probability matching is caused by uncertainty about how outcomes are generated, which leads to pattern search, exploration, and recency.
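To make the cost of probability matching concrete, the short sketch below compares the expected accuracy of a matcher and a maximiser when the majority outcome occurs with probability 0.7, as in the task described above. It is an illustration only, not the authors' model: the function name `expected_accuracy` and its parameters are hypothetical, and it assumes that outcomes are independent across trials and that each choice is made independently of the outcome on that trial.

```python
def expected_accuracy(p_majority: float, p_choose_majority: float) -> float:
    """Probability of a correct prediction on a single trial.

    Assumes (for illustration) that the choice and the outcome are
    independent: the prediction is correct when both are the majority
    option or both are the minority option.
    """
    both_majority = p_majority * p_choose_majority
    both_minority = (1 - p_majority) * (1 - p_choose_majority)
    return both_majority + both_minority


P_MAJORITY = 0.7  # majority-outcome probability stated in the abstract

# A matcher chooses the majority option as often as it occurs.
matching = expected_accuracy(P_MAJORITY, P_MAJORITY)

# A maximiser always chooses the majority option.
maximising = expected_accuracy(P_MAJORITY, 1.0)

print(f"matching:   {matching:.2f}")
print(f"maximising: {maximising:.2f}")
```

Under these assumptions a matcher is correct on about 58% of trials (0.7 × 0.7 + 0.3 × 0.3), whereas a maximiser is correct on 70%, so matching gives up roughly 12 percentage points of expected accuracy.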

Publication types

Clinical Trial
Research Support, Non-U.S. Gov't

MeSH terms

Adult
Decision Making / physiology*
Female
Humans
Male
Probability Learning*
Reinforcement, Psychology*