Theory meets pigeons: the influence of reward-magnitude on discrimination-learning

Behav Brain Res. 2009 Mar 2;198(1):125-9. doi: 10.1016/j.bbr.2008.10.038. Epub 2008 Nov 8.

Abstract

Modern theoretical accounts of reward-based learning are commonly based on reinforcement learning algorithms. The most prominent of these is the temporal-difference (TD) algorithm, in which the difference between predicted and obtained reward, the prediction error, serves as a learning signal. Consequently, larger rewards cause bigger prediction errors and lead to faster learning than smaller rewards. Therefore, if animals employ a neural implementation of TD learning, reward magnitude should affect learning in animals accordingly. Here we test this prediction by training pigeons on a simple color-discrimination task with two pairs of colors. In each pair, correct discrimination is rewarded: in pair one with a large reward, in pair two with a small reward. Pigeons acquired the 'large-reward' discrimination faster than the 'small-reward' discrimination. Animal behavior and an implementation of the TD algorithm yielded comparable results with respect to the difference between learning curves in the large-reward and small-reward conditions. We conclude that the influence of reward magnitude on the acquisition of a simple discrimination paradigm is accurately reflected by a TD implementation of reinforcement learning.
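The reasoning in the abstract (a larger reward produces a larger prediction error, and therefore faster acquisition) can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not describe. It assumes a one-step delta-rule update (the single-step special case of TD learning), a logistic choice rule between the two colors of a pair, and hypothetical values for the learning rate `alpha`, the choice sensitivity `beta`, the reward sizes, and the 80% criterion. To stay deterministic, it tracks the expected update rather than sampling individual choices.

```python
import math


def simulate(reward, alpha=0.1, beta=3.0, trials=200):
    """Expected learning curve for one color pair (illustrative parameters)."""
    v_correct, v_wrong = 0.0, 0.0  # predicted reward for each color
    curve = []
    for _ in range(trials):
        # Logistic (two-option softmax) probability of pecking the correct color.
        p_correct = 1.0 / (1.0 + math.exp(-beta * (v_correct - v_wrong)))
        curve.append(p_correct)
        # Prediction error = obtained reward minus predicted reward.
        delta_correct = reward - v_correct
        delta_wrong = 0.0 - v_wrong  # the incorrect color is never rewarded
        # Expected update: a value changes only on trials where that color is chosen.
        v_correct += alpha * p_correct * delta_correct
        v_wrong += alpha * (1.0 - p_correct) * delta_wrong
    return curve


def trials_to_criterion(curve, criterion=0.8):
    """Index of the first trial at or above the criterion, or None if never reached."""
    for i, p in enumerate(curve):
        if p >= criterion:
            return i
    return None


large = simulate(reward=1.0)  # hypothetical 'large-reward' pair
small = simulate(reward=0.5)  # hypothetical 'small-reward' pair

print("large-reward trials to criterion:", trials_to_criterion(large))
print("small-reward trials to criterion:", trials_to_criterion(small))
```

Under these assumptions both curves start at chance, and the large-reward curve reaches the criterion in fewer trials because each rewarded choice produces a larger prediction error and hence a larger value update. The absolute trial counts depend entirely on the assumed parameters and should not be read as estimates of pigeon behavior.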

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Behavior, Animal / physiology*
  • Color Perception / physiology
  • Columbidae
  • Conditioning, Classical / physiology
  • Discrimination Learning / physiology*
  • Motivation*
  • Reinforcement Schedule
  • Reinforcement, Psychology*
  • Reward*