Data-driven model reference control of MIMO vertical tank systems with model-free VRFT and Q-Learning

Mircea-Bogdan Radac; Radu-Emil Precup; Raul-Cristian Roman

doi:10.1016/j.isatra.2018.01.014

ISA Trans. 2018 Feb:73:227-238. doi: 10.1016/j.isatra.2018.01.014. Epub 2018 Jan 8.

Authors

Mircea-Bogdan Radac¹, Radu-Emil Precup², Raul-Cristian Roman³

Affiliations

¹ Department of Automation and Applied Informatics, Politehnica University of Timisoara, Bd. V. Parvan 2, 300223, Timisoara, Romania. Electronic address: mircea.radac@upt.ro.
² Department of Automation and Applied Informatics, Politehnica University of Timisoara, Bd. V. Parvan 2, 300223, Timisoara, Romania. Electronic address: radu.precup@upt.ro.
³ Department of Automation and Applied Informatics, Politehnica University of Timisoara, Bd. V. Parvan 2, 300223, Timisoara, Romania. Electronic address: raul-cristianroman@student.upt.ro.

PMID: 29325777

Abstract

This paper proposes a model-free control approach that combines Virtual Reference Feedback Tuning with Q-learning to tune nonlinear static state feedback controllers for output model reference tracking in an optimal control framework. The novel iterative Batch Fitted Q-learning strategy uses two neural networks to represent the value function (critic) and the controller (actor), and the combined scheme is referred to as the mixed Virtual Reference Feedback Tuning-Batch Fitted Q-learning approach. The learning convergence of Q-learning schemes generally depends, among other settings, on efficient exploration of the state-action space. Handcrafting test signals for efficient exploration is difficult even for unknown processes that are input-output stable. Virtual Reference Feedback Tuning can learn an initial stabilizing controller from a small amount of input-output data. By compensating the process dynamics, this controller can then be used to collect substantially more input-state data in a controlled manner and in a constrained environment. These data are used to learn significantly superior nonlinear state feedback neural network controllers for model reference tracking with the proposed iterative Batch Fitted Q-learning tuning strategy, which motivates the original combination of the two techniques. The mixed Virtual Reference Feedback Tuning-Batch Fitted Q-learning approach is experimentally validated on water level control of a multi input-multi output nonlinear constrained coupled two-tank system. The observed control behavior is discussed.
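
As a rough illustration of the Virtual Reference Feedback Tuning step only, the following Python sketch fits a controller directly from one open-loop input-output record, without identifying a process model. It is not the authors' implementation. The paper treats a nonlinear multi input-multi output tank system with neural network controllers, whereas this toy assumes a scalar first-order linear plant, a first-order reference model and a PI controller. The function names `simulate_plant` and `vrft_pi` and the values of `a`, `b` and `m` are hypothetical and chosen only for the example. The Batch Fitted Q-learning stage is not shown.

```python
import numpy as np


def simulate_plant(u, a, b):
    """Open-loop response of the assumed toy plant y[k+1] = a*y[k] + b*u[k]."""
    y = np.zeros(len(u))
    for k in range(len(u) - 1):
        y[k + 1] = a * y[k] + b * u[k]
    return y


def vrft_pi(u, y, m):
    """Fit incremental PI gains (theta1, theta2) from data by least squares.

    Assumed reference model: y_ref[k+1] = m*y_ref[k] + (1 - m)*r[k].
    Assumed controller:      u[k] - u[k-1] = theta1*e[k] + theta2*e[k-1].
    """
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    # Virtual reference: the r[k] for which the reference model would have
    # produced exactly the measured output y.
    r_virt = (y[1:] - m * y[:-1]) / (1.0 - m)
    # Virtual tracking error the controller would have seen.
    e_virt = r_virt - y[:-1]
    n = len(e_virt)
    # Regress the measured input increments on the virtual errors.
    du = u[1:n] - u[:n - 1]
    phi = np.column_stack([e_virt[1:], e_virt[:-1]])
    theta, *_ = np.linalg.lstsq(phi, du, rcond=None)
    return theta


# Hypothetical numbers, used only to generate data; vrft_pi never sees a or b.
a, b, m = 0.9, 0.2, 0.6
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=200)
y = simulate_plant(u, a, b)
theta = vrft_pi(u, y, m)
print("PI gains from VRFT:", theta)
```

For this noise-free toy, the least-squares fit recovers the gains (1 - m)/b and -a(1 - m)/b, for which the closed loop matches the assumed reference model. With measurement noise or a nonlinear process the fit is only approximate, which is consistent with the abstract's use of this step to obtain an initial stabilizing controller rather than the final one.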

Keywords: Batch fitted Q-learning; Model reference tracking; Model-free optimal control; Multi input-multi output systems; Neural networks; Vertical tank systems; Virtual reference feedback tuning.