Research on AUV Path Planning Based on Deep Reinforcement Learning

Traditional path planning algorithms for autonomous underwater vehicles (AUVs) in 3D marine environments suffer from long search times and strong dependence on the environment, and they must re-plan the path whenever the environment changes, so they fail to meet real-time requirements. To enable an AUV to learn the scene autonomously and make decisions, an improved Dueling Deep Q-Network (Dueling DQN) algorithm is proposed, in which the traditional network structure is modified to suit the AUV path planning scenario. In addition, to address the difficulty of finding the target point in 3D space, an experience distillation replay pool is introduced on top of the existing prioritized experience replay pool. It lets the agent learn from failure experiences, which improves the convergence speed and stability of the model in the early stages. Simulation results show that the proposed algorithm offers better real-time performance and plans shorter paths than traditional path planning algorithms, and that it outperforms the standard DQN algorithm in both convergence speed and stability.
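As a rough illustration of the dueling idea the abstract refers to, the sketch below combines a state value V(s) and per-action advantages A(s, a) into Q-values using the standard mean-subtracted form, Q(s, a) = V(s) + A(s, a) - mean(A). It is only the textbook aggregation step, not the modified network or the experience distillation replay pool described in the paper, whose details are not given here. The function names and the six-move 3D action set are hypothetical and chosen for the example.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) with the mean-subtracted dueling form."""
    if not advantages:
        raise ValueError("advantages must contain at least one action")
    mean_advantage = sum(advantages) / len(advantages)
    # Subtracting the mean keeps V and A identifiable: the Q-values average to V(s).
    return [state_value + (a - mean_advantage) for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical action set: six axis-aligned moves in a 3D grid (+x, -x, +y, -y, +z, -z).
q = dueling_q_values(state_value=2.0, advantages=[0.5, -0.5, 1.5, -1.5, 0.0, 0.0])
best = greedy_action(q)
```

In a real agent the value and advantage inputs would come from two output heads of a neural network; plain floats are used here so that the aggregation step can be checked by hand.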

Keywords: autonomous underwater vehicle (AUV); 3D path planning; deep reinforcement learning; Dueling DQN algorithm

Fang Pengcheng (房鹏程), Zhou Huanyin (周焕银), Dong Meijun (董玫君)

School of Mechanical and Electronic Engineering, East China University of Technology, Nanchang, Jiangxi 330000, China

Funding: National Natural Science Foundation of China (62063001); Natural Science Foundation of Jiangxi Province (20224ACB204022)

2024

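Machine Tool & Hydraulics (机床与液压)
Chinese Mechanical Engineering Society; Guangzhou Mechanical Engineering Research Institute Co., Ltd.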

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.32
ISSN: 1001-3881
Year, volume (issue): 2024, 52(9)