In optimal control problems with unknown system model parameters, how quickly policy iteration converges to the optimal control policy depends chiefly on how well the value function is estimated. To improve both the accuracy and the speed of value function estimation, this paper proposes a policy iteration optimal control algorithm with adaptive window length adjustment. The algorithm makes full use of historical sample data collected over a period of time: an influence function is used to quantify the relationship between the data window length and the estimation performance of the value function, and the window length is then adjusted adaptively according to its effect on that performance. Finally, the proposed method is applied to a continuous fermentation process. Simulation results show that the method accelerates convergence to the optimal control policy, overcomes the influence of parameter changes and external disturbances on control performance, and improves control accuracy.
Key words
optimal control/policy iteration/adaptive window length adjustment/influence function