Multiagent cooperation and competition with deep reinforcement learning

PLoS One. 2017 Apr 5;12(4):e0172395. doi: 10.1371/journal.pone.0172395. eCollection 2017.

Abstract

Cooperation and competition can emerge when multiple adaptive agents share a biological, social, or technological niche. In the present work we study how cooperation and competition emerge between autonomous agents that learn by reinforcement, using only their raw visual input as the state representation. In particular, we extend the Deep Q-Learning framework to multiagent environments to investigate the interaction between two learning agents in the well-known video game Pong. By manipulating the classical rewarding scheme of Pong, we show how competitive and collaborative behaviors emerge. We also describe the progression from competitive to collaborative behavior as the incentive to cooperate is increased. Finally, we show that learning by playing against another adaptive agent, instead of against a hard-wired algorithm, results in more robust strategies. The present work shows that Deep Q-Networks can become a useful tool for studying decentralized learning in multiagent systems coping with high-dimensional environments.
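The manipulation of Pong's rewarding scheme described above can be sketched as a single tunable parameter. In this hypothetical illustration (the parameter name `rho` and the function below are assumptions, not the paper's code), the player who loses the ball always receives -1, while the reward given to the scoring player is interpolated from fully competitive (+1, zero-sum) to fully cooperative (-1, both penalized when the ball is lost); each agent would then train its own independent Q-network on these rewards:

```python
def pong_rewards(scoring_player: int, rho: float) -> tuple:
    """Return (reward_left, reward_right) after a point in two-player Pong.

    scoring_player: 0 for the left paddle, 1 for the right paddle.
    rho: reward handed to the scorer, in [-1.0, 1.0] (assumed parameter).
    """
    rewards = [-1.0, -1.0]          # the conceding player always gets -1
    rewards[scoring_player] = rho   # the scorer's reward is the tunable part
    return tuple(rewards)

# Fully competitive scheme: classical zero-sum Pong.
print(pong_rewards(0, rho=1.0))    # (1.0, -1.0)
# Fully cooperative scheme: both players lose whenever the ball goes out.
print(pong_rewards(0, rho=-1.0))   # (-1.0, -1.0)
```

Intermediate values of `rho` would then trace the progression from competitive to collaborative behavior that the abstract describes.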

MeSH terms

  • Algorithms
  • Cooperative Behavior
  • Game Theory
  • Humans
  • Interpersonal Relations
  • Learning / physiology*
  • Reinforcement, Psychology
  • Reward
  • Social Behavior

Grants and funding

All authors gratefully acknowledge the support of NVIDIA Corporation with the donation of one GeForce GTX TITAN X GPU used for this research. RV also thanks the Estonian Research Council for financial support via grant PUT438 (https://www.etis.ee/Portal/Projects/Display/e3760907-3178-4863-b7a1-2c2628d6c67a). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.