Quantifying exploration in reward-based motor learning

Nina M van Mastrigt; Jeroen B J Smeets; Katinka van der Kooij

doi:10.1371/journal.pone.0226789

PLoS One. 2020 Apr 2;15(4):e0226789. doi: 10.1371/journal.pone.0226789. eCollection 2020.

Authors

Nina M van Mastrigt¹, Jeroen B J Smeets¹, Katinka van der Kooij¹

Affiliation

¹ Department of Human Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.

Abstract

Exploration in reward-based motor learning is observable in experimental data as increased variability. To quantify exploration, we compared three methods for estimating the other source of variability: sensorimotor noise. We used a task in which participants could receive stochastic binary reward feedback following a target-directed weight shift. Participants first performed six baseline blocks without feedback, followed by twenty blocks, alternating between blocks with feedback and blocks without feedback. Variability was assessed from trial-to-trial changes in movement endpoint. We estimated sensorimotor noise as the median squared trial-to-trial change in movement endpoint for trials in which no exploration is expected. We identified three types of such trials: trials in baseline blocks, trials in the blocks without feedback, and rewarded trials in the blocks with feedback. We estimated exploration as the median squared trial-to-trial change following non-rewarded trials minus the sensorimotor noise estimate. As expected, variability was larger following non-rewarded trials than following rewarded trials, which indicates that our reward-based weight-shifting task successfully induced exploration. Most importantly, our three estimates of sensorimotor noise differed: the estimate based on rewarded trials was significantly lower than the estimates based on the two types of trials without feedback. Consequently, the estimates of exploration also differed. We conclude that the quantification of exploration depends critically on the type of trials used to estimate sensorimotor noise. We recommend estimating sensorimotor noise from the variability following rewarded trials.
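The estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the one-dimensional endpoint representation, and the example data are all hypothetical, and only the variant that estimates sensorimotor noise from rewarded trials is shown. Block boundaries are ignored for simplicity.

```python
from statistics import median


def squared_changes(endpoints, rewarded, after_reward):
    """Squared change from trial t to trial t+1, kept only for trials t
    whose reward outcome equals `after_reward` (hypothetical helper)."""
    return [
        (endpoints[t + 1] - endpoints[t]) ** 2
        for t in range(len(endpoints) - 1)
        if rewarded[t] == after_reward
    ]


def estimate_exploration(endpoints, rewarded):
    """Return (noise, exploration), assuming 1-D movement endpoints.

    noise:       median squared change following rewarded trials
    exploration: median squared change following non-rewarded trials,
                 minus the noise estimate
    """
    noise = median(squared_changes(endpoints, rewarded, True))
    after_failure = median(squared_changes(endpoints, rewarded, False))
    return noise, after_failure - noise


# Made-up endpoints (arbitrary units) and reward outcomes for seven trials.
endpoints = [0.0, 1.0, 1.5, 3.5, 4.0, 7.0, 7.5]
rewarded = [True, True, False, True, False, True, True]

noise, exploration = estimate_exploration(endpoints, rewarded)
print(noise, exploration)
```

The other two noise estimates mentioned in the abstract would, under the same assumptions, apply the same median to all trial-to-trial changes within baseline blocks or within blocks without feedback.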

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adolescent
Adult
Behavior / physiology
Biofeedback, Psychology
Female
Humans
Learning / physiology*
Male
Middle Aged
Motor Activity / physiology*
Musculoskeletal Physiological Phenomena
Psychomotor Performance / physiology*
Reaction Time / physiology
Research Design
Reward
Statistical Distributions
Young Adult

Grants and funding

The research was funded by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek, Toegepaste en Technische Wetenschappen (NWO-TTW), through Open Technologie Programma (OTP) grant 15989, awarded to Jeroen Smeets.