A micro-genesis account of longer-form reinforcement learning in structured and unstructured environments

Benjamin James Dyson; Ahad Asad

doi:10.1038/s41539-021-00098-4

NPJ Sci Learn. 2021 Jun 23;6(1):19. doi: 10.1038/s41539-021-00098-4.

Authors

Benjamin James Dyson¹ ² ³, Ahad Asad⁴

Affiliations

¹ University of Alberta, Edmonton, AB, Canada. bjdyson@ualberta.ca.
² University of Sussex, Falmer, UK. bjdyson@ualberta.ca.
³ Ryerson University, Toronto, ON, Canada. bjdyson@ualberta.ca.
⁴ University of Alberta, Edmonton, AB, Canada.

Abstract

We explored the possibility that in order for longer-form expressions of reinforcement learning (win-calmness, loss-restlessness) to manifest across tasks, they must first develop because of micro-transactions within tasks. We found no evidence of win-calmness or loss-restlessness when wins could not be maximised (unexploitable opponents), nor when the threat of win minimisation was presented (exploiting opponents), but evidence of win-calmness (but not loss-restlessness) when wins could be maximised (exploitable opponents).