Learning-Based Predictive Control for Discrete-Time Nonlinear Systems With Stochastic Disturbances

Xin Xu; Hong Chen; Chuanqiang Lian; Dazi Li

doi:10.1109/TNNLS.2018.2820019

IEEE Trans Neural Netw Learn Syst. 2018 Dec;29(12):6202-6213. doi: 10.1109/TNNLS.2018.2820019. Epub 2018 May 9.

PMID: 29993751

Abstract

This paper proposes a learning-based predictive control (LPC) scheme for adaptive optimal control of discrete-time nonlinear systems under stochastic disturbances. LPC differs from conventional model predictive control (MPC), which uses open-loop optimization or simplified closed-loop optimal control techniques in each horizon. In LPC, the control task in each horizon is formulated as a closed-loop nonlinear optimal control problem, and a finite-horizon iterative reinforcement learning (RL) algorithm is developed to obtain closed-loop optimal or suboptimal solutions. RL and adaptive dynamic programming (ADP) thus serve as a new class of closed-loop, learning-based optimization techniques for nonlinear predictive control with stochastic disturbances. LPC also decomposes the infinite-horizon optimal control problem of previous RL and ADP methods into a series of finite-horizon problems, which reduces computational cost and improves learning efficiency. Convergence of the finite-horizon iterative RL algorithm in each prediction horizon and Lyapunov stability of the closed-loop control system are proved. In addition, because policies are updated successively between adjacent time horizons, LPC has a lower computational cost than conventional MPC, which runs an independent optimization procedure in each prediction horizon. Simulation results show that, compared with conventional nonlinear MPC and ADP, the proposed LPC scheme achieves better performance in both policy optimality and computational efficiency.
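
The idea of solving a closed-loop, finite-horizon problem under stochastic disturbances and then applying only the first-stage feedback law can be illustrated with a minimal sketch. This is not the paper's algorithm: grid-based backward dynamic programming stands in for the finite-horizon iterative RL method, and the scalar dynamics, quadratic cost, grids, noise level, and all function names are assumptions chosen for illustration only. The successive policy updates between horizons described in the abstract are not reproduced here.

```python
import numpy as np

def finite_horizon_policy(f, cost, xs, us, ws, horizon):
    """Backward dynamic programming on a state grid (stand-in for the RL step).

    Returns a list of per-stage feedback laws (action indices per grid state)
    and the first-stage value table.
    """
    V = np.zeros(len(xs))  # assumption: zero terminal cost
    policy = []
    for _ in range(horizon):
        Q = np.empty((len(xs), len(us)))
        for j, u in enumerate(us):
            # Next states for every grid state and every disturbance sample.
            nxt = np.clip(f(xs[:, None], u) + ws[None, :], xs[0], xs[-1])
            # Expected cost-to-go: interpolate V, then average over samples.
            v_next = np.interp(nxt.ravel(), xs, V).reshape(nxt.shape).mean(axis=1)
            Q[:, j] = cost(xs, u) + v_next
        policy.append(Q.argmin(axis=1))
        V = Q.min(axis=1)
    policy.reverse()  # policy[0] is the feedback law for the first stage
    return policy, V

def run_receding_horizon(x0=1.5, steps=30, horizon=10, seed=0):
    """Simulate a toy stochastic system under the first-stage feedback law."""
    rng = np.random.default_rng(seed)
    xs = np.linspace(-2.0, 2.0, 81)    # hypothetical state grid
    us = np.linspace(-1.0, 1.0, 21)    # hypothetical action grid
    sigma = 0.05                       # hypothetical disturbance std
    ws = rng.normal(0.0, sigma, size=50)

    f = lambda x, u: 0.9 * x + 0.2 * np.sin(x) + u   # toy nonlinear dynamics
    cost = lambda x, u: x**2 + 0.1 * u**2            # toy stage cost

    # The toy model is time-invariant, so re-solving at every step would give
    # the same tables; they are computed once and reused.
    policy, _ = finite_horizon_policy(f, cost, xs, us, ws, horizon)

    x, traj = x0, [x0]
    for _ in range(steps):
        i = np.abs(xs - x).argmin()    # nearest grid state
        u = us[policy[0][i]]           # closed-loop: action depends on state
        x = float(np.clip(f(x, u) + rng.normal(0.0, sigma), xs[0], xs[-1]))
        traj.append(x)
    return np.array(traj)
```

Calling `run_receding_horizon()` returns the simulated state trajectory. Because the control is a feedback law over states rather than an open-loop action sequence, the controller reacts to each realized disturbance, which is the distinction from open-loop MPC optimization that the abstract draws.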

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Computer Simulation
Humans
Learning / physiology*
Models, Neurological*
Neural Networks, Computer
Nonlinear Dynamics*
Stochastic Processes*
Time Factors