Performance optimization for tracking control of fixed-wing UAV with incremental Q-learning

To meet the high-performance requirements of longitudinal control for fixed-wing unmanned aerial vehicles (UAVs), a performance optimization structure for the control system is proposed. The structure comprises a nominal controller that stabilizes the system and an incremental controller that handles performance optimization. The incremental implementation leaves the original control system unchanged; it only compensates the control input of the nominal control system and optimizes its control performance. The incremental controller is designed using Q-learning theory. For systems whose state information is fully available, an incremental Q-learning algorithm based on state feedback is developed. When the state information is not fully available, an incremental Q-learning algorithm based on output feedback is designed using system input, output, and reference signal data. Both incremental controllers learn the incremental control law adaptively in a data-driven setting, without prior knowledge of the system dynamics model or the control gain of the nominal controller. In addition, it is proved that, under excitation noise satisfying the persistent excitation condition, the incremental Q-learning method solves the Q-function Bellman equation without bias. Finally, the effectiveness of the method is verified through simulation on a longitudinal model of the F-16 aircraft.
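The state-feedback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a regulation problem rather than tracking, a quadratic cost on the state and the incremental input, and a hypothetical second-order plant with a hypothetical nominal gain `K0` in place of the F-16 longitudinal model. The learner only calls `step` and never reads `A`, `B`, or `K0`, mirroring the data-driven setting. The Q-function is taken as a quadratic form in `[x; du]` and is fitted by least squares from data collected with probing noise; every name below is an illustrative assumption.

```python
import numpy as np

def quad_features(z):
    # Monomials of z chosen so that z' H z == quad_features(z) @ theta,
    # where theta lists the upper-triangular entries of the symmetric H.
    n = z.size
    return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                     for i in range(n) for j in range(i, n)])

def theta_to_H(theta, n):
    # Rebuild the symmetric matrix H from its upper-triangular entries.
    H = np.zeros((n, n))
    idx = 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def incremental_q_learning(step, nx, nu, Qc, Rc, iters=10, samples=200, seed=0):
    # Policy iteration on the incremental gain K, with du = -K x.
    # `step(x, du)` is a black box: plant plus nominal controller.
    rng = np.random.default_rng(seed)
    K = np.zeros((nu, nx))  # start with no compensation
    for _ in range(iters):
        Phi, y = [], []
        x = rng.standard_normal(nx)
        for _ in range(samples):
            # Behavior input: current policy plus probing noise for excitation.
            du = -K @ x + 0.5 * rng.standard_normal(nu)
            x_next = step(x, du)
            z = np.concatenate([x, du])
            # The successor action follows the noise-free policy being evaluated.
            z_next = np.concatenate([x_next, -K @ x_next])
            # Bellman equation: z' H z - z_next' H z_next = stage cost.
            Phi.append(quad_features(z) - quad_features(z_next))
            y.append(x @ Qc @ x + du @ Rc @ du)
            x = x_next
        theta = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)[0]
        H = theta_to_H(theta, nx + nu)
        # Policy improvement: minimize the quadratic Q-function over du.
        K = np.linalg.solve(H[nx:, nx:], H[nx:, :nx])
    return K

# Hypothetical plant and nominal controller (assumptions, used only to simulate).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
K0 = np.array([[1.0, 1.5]])  # stabilizing nominal gain, unknown to the learner
Qc, Rc = np.eye(2), np.array([[1.0]])

def step(x, du):
    # Total input = nominal control + incremental compensation.
    return A @ x + B @ (-K0 @ x + du)

K_inc = incremental_q_learning(step, nx=2, nu=1, Qc=Qc, Rc=Rc)
print("learned incremental gain:", K_inc)
```

Because the regressor uses the input actually applied, probing noise included, the Bellman equation holds exactly along the data of this deterministic sketch. The least-squares fit is therefore not biased by the noise, which is the property the abstract claims for the proposed method.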

reinforcement learning; Q-learning; incremental control; performance optimization; tracking control; UAV

ZHAO Zhengen (赵振根), CHENG Lei (程磊)

College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

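Funding: National Natural Science Foundation of China (62003161); Natural Science Foundation of Jiangsu Province (BK20190399); China Postdoctoral Science Foundation (2021M701701)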

Control and Decision (控制与决策)
Northeastern University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.227
ISSN: 1001-0920
Year, volume (issue): 2024, 39(2)