Efficient model learning methods for actor-critic control

IEEE Trans Syst Man Cybern B Cybern. 2012 Jun;42(3):591-602. doi: 10.1109/TSMCB.2011.2170565. Epub 2011 Dec 7.

Abstract

We propose two new actor-critic algorithms for reinforcement learning. Both algorithms use local linear regression (LLR) to learn approximations of the functions involved. A crucial feature of the algorithms is that they also learn a process model, and this, in combination with LLR, provides an efficient policy update for faster learning. The first algorithm uses a novel model-based update rule for the actor parameters. The second algorithm does not use an explicit actor but learns a reference model which represents a desired behavior, from which desired control actions can be calculated using the inverse of the learned process model. The two novel methods and a standard actor-critic algorithm are applied to the pendulum swing-up problem, in which the novel methods achieve faster learning than the standard algorithm.
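
The two ingredients named in the abstract, local linear regression (LLR) function approximators and a learned process model that speeds up the policy update, can be illustrated with a short sketch. The Python code below is an assumption-laden toy, not the authors' implementation: the LLRMemory class, all hyperparameters, and the simplified swing-up dynamics are hypothetical, and the actor update uses a finite difference through the learned model as a stand-in for the paper's model-based update rule.

import numpy as np

class LLRMemory:
    """Store (x, y) samples; predict at a query point by fitting an
    affine model to the k nearest stored samples (local linear
    regression)."""
    def __init__(self, k=8, max_samples=2000):
        self.k, self.max_samples = k, max_samples
        self.X, self.Y = [], []

    def add(self, x, y):
        self.X.append(np.atleast_1d(x).astype(float))
        self.Y.append(np.atleast_1d(y).astype(float))
        if len(self.X) > self.max_samples:      # forget the oldest sample
            self.X.pop(0); self.Y.pop(0)

    def predict(self, x):
        x = np.atleast_1d(x).astype(float)
        X, Y = np.array(self.X), np.array(self.Y)
        if len(X) < self.k:                     # not enough data yet
            return np.zeros(Y.shape[1]) if len(Y) else np.zeros(1)
        idx = np.argsort(np.linalg.norm(X - x, axis=1))[: self.k]
        A = np.hstack([X[idx], np.ones((self.k, 1))])   # affine features
        beta, *_ = np.linalg.lstsq(A, Y[idx], rcond=None)
        return np.append(x, 1.0) @ beta

def pendulum_step(s, u, dt=0.05):
    """One Euler step of a damped pendulum; theta = 0 is hanging down."""
    th, thd = s
    thdd = -9.81 * np.sin(th) - 0.2 * thd + u
    thd = thd + dt * thdd
    th = (th + dt * thd + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    return np.array([th, thd])

critic = LLRMemory()   # state           -> value estimate
model  = LLRMemory()   # (state, action) -> next state
actor  = LLRMemory()   # state           -> action

gamma, alpha_c, alpha_a, du = 0.97, 0.3, 0.5, 0.1
rng = np.random.default_rng(0)

for episode in range(50):
    s = np.array([0.0, 0.0])                    # start hanging down
    for t in range(200):
        u = float(np.clip(actor.predict(s)[0] + rng.normal(0.0, 1.0), -3, 3))
        s_next = pendulum_step(s, u)
        r = -np.cos(s_next[0])                  # +1 when upright
        # TD(0) critic update, written back into the LLR memory
        v, v_next = critic.predict(s)[0], critic.predict(s_next)[0]
        critic.add(s, v + alpha_c * (r + gamma * v_next - v))
        # learn the process model from the observed transition
        model.add(np.append(s, u), s_next)
        # model-based actor update: nudge the action along an estimate of
        # dV/du obtained by a finite difference through the learned model
        dV = critic.predict(model.predict(np.append(s, u + du)))[0] - v_next
        actor.add(s, np.clip(u + alpha_a * dV / du, -3.0, 3.0))
        s = s_next

The abstract's second algorithm would instead drop the explicit actor, learn a reference model of the desired behavior, and invert the learned process model to obtain actions; that variant is omitted here for brevity.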

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Computer Simulation
  • Decision Support Techniques*
  • Models, Theoretical*
  • Pattern Recognition, Automated / methods*