Research on AUV Docking Control Algorithm Based on Reinforcement Learning
Zhuang Yinghao (庄英豪) 1, Zhang Tianze (张天泽) 2, Zhang Yue (张悦) 1, Li Yibin (李沂滨) 1
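Author information
- 1. Institute of Marine Science and Technology, Shandong University, Qingdao, Shandong 266000
- 2. College of Mechanical and Electronic Engineering, China University of Petroleum (East China), Qingdao, Shandong 266580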
Abstract
Autonomous underwater vehicles (AUVs) are important equipment for exploring and exploiting the ocean, and the ability to solve path planning and control problems intelligently is the basis for an AUV to accomplish other complex tasks. This work considers the local path planning problem under terminal attitude constraints in the practical scenario of autonomous AUV docking. A docking controller is developed based on an improved deep reinforcement learning (DRL) algorithm, which gives the AUV the ability to dock autonomously and thereby extends its endurance. To account for complex wave disturbances in real operating conditions, a nonlinear disturbance observer (NDO) is used to estimate the external disturbance acting on each degree of freedom of the AUV's three-dimensional motion. These estimates are combined with measurable states to design the observations and reward function of the DRL agent, so that the AUV can complete the three-dimensional docking control task in a disturbed environment. Simulation results demonstrate the effectiveness and robustness of the proposed method.
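The abstract does not give the observer equations, so the following is only a minimal sketch of how a nonlinear disturbance observer of the common "auxiliary state" form can recover an unknown force. It assumes a hypothetical single-degree-of-freedom surge model m·dv/dt = τ − c·v + d; the mass `m`, damping `c`, observer gain `k`, thrust `tau`, and the sinusoidal disturbance are all illustrative values, not taken from the paper. With the estimate defined as d̂ = z + k·m·v and dz/dt = −k·d̂ − k·(τ − c·v), the estimation error obeys d(d̂)/dt = k·(d − d̂), so d̂ tracks a slowly varying d.

```python
import math


def simulate_ndo(k=5.0, m=30.0, c=10.0, dt=0.001, t_end=5.0):
    """Simulate a 1-DOF surge model with a disturbance observer.

    All parameters are hypothetical illustration values.
    Returns the true disturbance and its estimate at the last step.
    """
    v = 0.0              # surge velocity (measurable state)
    z = -k * m * v       # auxiliary observer state, chosen so d_hat(0) = 0
    d = 0.0
    d_hat = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        tau = 5.0                              # assumed constant thrust
        d = 2.0 + 1.0 * math.sin(0.5 * t)      # assumed slow wave-like disturbance
        # Disturbance estimate from the auxiliary state and the measured velocity.
        d_hat = z + k * m * v
        # Observer dynamics: uses only tau and v, never the true d.
        z_dot = -k * d_hat - k * (tau - c * v)
        # Plant dynamics: the true disturbance acts here.
        v_dot = (tau - c * v + d) / m
        # Forward-Euler integration of both the observer and the plant.
        z += dt * z_dot
        v += dt * v_dot
    return d, d_hat


if __name__ == "__main__":
    d_true, d_est = simulate_ndo()
    print(f"true disturbance: {d_true:.3f}, estimate: {d_est:.3f}")
```

In a docking controller of the kind described, an estimate such as `d_hat` for each degree of freedom could be appended to the agent's observation vector alongside the measurable states; how the paper actually does this is not specified in the abstract.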
Key words
autonomous underwater vehicle / path planning / docking control / reinforcement learning
Year of publication
2024