Linear quadratic optimal control method based on output feedback inverse reinforcement Q-learning
This paper proposes a data-driven output feedback optimal control method, based on inverse reinforcement Q-learning, for the linear quadratic optimal control problem of linear discrete-time systems with unknown model parameters and unmeasurable states. Only input and output data are used to adaptively determine suitable quadratic performance index weights and the optimal control law, so that the system reproduces the reference trajectories. First, a parameter correction equation is proposed; combining it with inverse optimal control yields a model-based inverse reinforcement learning framework for optimal control, which computes corrections to the output feedback control law and the performance index weights. On this basis, the idea of reinforcement Q-learning is introduced, leading to a data-driven output feedback inverse reinforcement Q-learning optimal control method that requires no system model parameters and uses only historical input and output data to solve for the output feedback control law parameters and the performance index weights. Theoretical analysis and simulation experiments verify the effectiveness of the proposed method.
inverse reinforcement learning; Q-learning; output feedback; data-driven optimal control
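For readers unfamiliar with Q-learning in the linear quadratic setting, the following is a minimal sketch of the standard forward, state-feedback variant. It is not the output feedback inverse method summarized above: it assumes the state is measurable and the weights Q and R are known, and it uses the model (A, B) only as a simulator to generate data. The system matrices, the function names `q_learning_lqr` and `riccati_gain`, and all numerical values are illustrative assumptions. The sketch fits the quadratic Q-function kernel H by least squares from sampled transitions, improves the gain from H, and compares the result with the gain from a Riccati recursion.

```python
import numpy as np

def q_learning_lqr(A, B, Q, R, K0, n_iter=10, n_samples=200, seed=0):
    """Policy-iteration Q-learning for discrete-time LQR (state feedback).

    A and B are used only to simulate transitions; the learning step itself
    sees just (x, u, x_next) samples and the stage cost.
    K0 must be a stabilizing initial gain for the policy u = -K0 x.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0.copy()
    H = None
    for _ in range(n_iter):
        Phi, cost = [], []
        for _ in range(n_samples):
            x = rng.standard_normal(n)          # sampled state
            u = rng.standard_normal(m)          # exploratory input
            x_next = A @ x + B @ u              # simulator step (assumed model)
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, -K @ x_next])
            # Bellman equation: z' H z - z_next' H z_next = stage cost
            Phi.append(np.kron(z, z) - np.kron(z_next, z_next))
            cost.append(x @ Q @ x + u @ R @ u)
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = h.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)                     # enforce symmetry
        # Policy improvement: u = -H_uu^{-1} H_ux x
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

def riccati_gain(A, B, Q, R, n_iter=2000):
    """Model-based reference gain from the Riccati difference recursion."""
    P = Q.copy()
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(n_iter):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Illustrative (assumed) stable system, so K0 = 0 is a stabilizing start.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

K_learned, H_learned = q_learning_lqr(A, B, Q, R, K0=np.zeros((1, 2)))
K_star = riccati_gain(A, B, Q, R)
print("learned gain:", K_learned)
print("Riccati gain:", K_star)
```

In this sketch the weights are given and only the gain is learned. The inverse problem described in the abstract goes the other way, adjusting the weights until the resulting optimal behavior matches the reference trajectories, and does so from input and output data rather than from the state.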