Reinforcement learning produces dominant strategies for the Iterated Prisoner's Dilemma

PLoS One. 2017 Dec 11;12(12):e0188046. doi: 10.1371/journal.pone.0188046. eCollection 2017.

Abstract

We present tournament results and several powerful strategies for the Iterated Prisoner's Dilemma created using reinforcement learning techniques (evolutionary and particle swarm algorithms). These strategies are trained to perform well against a corpus of over 170 distinct opponents, including many well-known and classic strategies. All the trained strategies win standard tournaments against the full collection of other opponents. The trained strategies and one particular human-designed strategy are also the top performers in noisy tournaments.
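To make the training idea concrete, the following is a minimal sketch of evolving a strategy against a fixed pool of opponents. It is not the authors' implementation: the payoff values (3, 0, 5, 1), the four hand-written opponents, the memory-one lookup-table encoding, and all function names (`play_match`, `make_lookup_player`, `fitness`, `evolve`) are assumptions chosen for illustration. The study itself trains against a corpus of over 170 opponents.

```python
import random

# Payoffs keyed by (my move, opponent move). The values R=3, S=0, T=5, P=1
# are a common convention and are assumed here; the abstract does not state them.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]

# A tiny illustrative opponent pool. Each strategy receives its own history
# and the opponent's history and returns "C" or "D".
def tit_for_tat(own, opp):
    return opp[-1] if opp else "C"

def always_defect(own, opp):
    return "D"

def always_cooperate(own, opp):
    return "C"

def grudger(own, opp):
    return "D" if "D" in opp else "C"

OPPONENTS = [tit_for_tat, always_defect, always_cooperate, grudger]

def make_lookup_player(genome):
    """Build a memory-one player from a 5-gene genome (hypothetical encoding):
    gene 0 is the opening move, genes 1-4 are the replies to the previous
    round's (own move, opponent move) in the order given by STATES."""
    first, table = genome[0], dict(zip(STATES, genome[1:]))
    def player(own, opp):
        return table[(own[-1], opp[-1])] if own else first
    return player

def play_match(p1, p2, turns=50):
    """Play a noise-free match and return both players' total scores."""
    h1, h2, s1, s2 = [], [], 0, 0
    for _ in range(turns):
        m1, m2 = p1(h1, h2), p2(h2, h1)
        a, b = PAYOFFS[(m1, m2)]
        s1, s2 = s1 + a, s2 + b
        h1.append(m1)
        h2.append(m2)
    return s1, s2

def fitness(genome):
    """Total score of the encoded player against every opponent in the pool."""
    player = make_lookup_player(genome)
    return sum(play_match(player, opp)[0] for opp in OPPONENTS)

def evolve(generations=30, pop_size=20, mutation_rate=0.2, seed=0):
    """Simple evolutionary loop: keep the fitter half, refill with mutated copies."""
    rng = random.Random(seed)
    population = [tuple(rng.choice("CD") for _ in range(5))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = [tuple(rng.choice("CD") if rng.random() < mutation_rate else g
                          for g in parent)
                    for parent in survivors]
        population = survivors + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("best genome:", best, "fitness:", fitness(best))
```

Because the fitness function only sums scores against the chosen pool, the evolved strategy is specialized to that pool. A larger and more varied set of opponents, as in the study's corpus, changes what counts as a strong strategy.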

MeSH terms

  • Algorithms
  • Game Theory
  • Humans
  • Learning*
  • Prisoner Dilemma*

Grants and funding

Google Inc. provided support in the form of salaries for author MH, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific role of this author is articulated in the ‘author contributions’ section.