
Path Planning Method for Unmanned Surface Vessel in On-call Submarine Search Based on Improved DQN Algorithm

Aiming at the situation in which an unmanned surface vessel (USV) manoeuvres in course and speed during on-call anti-submarine search, a path planning method for the USV based on an improved deep Q-learning (DQN) algorithm is proposed. The method combines the on-call submarine search model with an improved deep reinforcement learning (I-DQN) algorithm, obtaining an optimal path by jointly adjusting the USV's action space, action selection strategy and reward. The algorithm adopts a time-varying dynamic greedy strategy that adaptively adjusts the USV's action selection according to the environment and the learning performance of the neural network, improving the global search ability and avoiding convergence to local optima. A piecewise nonlinear reward-and-punishment function is set according to the obstacle environment and the current position of the USV, which speeds up convergence while ensuring collision avoidance. A Bezier-curve algorithm is added to smooth the planned path. Simulation results show that, in the same environment, the proposed method outperforms the DQN, A* and artificial potential field (APF) algorithms, with better stability, convergence and safety.
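The abstract names three mechanisms: a time-varying greedy (exploration) schedule, a piecewise nonlinear reward-and-punishment function, and Bezier-curve path smoothing. The sketch below illustrates one plausible form of each; all parameter values (`eps_start`, `decay`, `danger_radius`, the ±100 terminal rewards) and function shapes are illustrative assumptions, not values taken from the paper.

```python
import math
import random

def epsilon(step, eps_start=1.0, eps_end=0.05, decay=500.0):
    """Time-varying greedy schedule: exploration probability decays
    exponentially as training progresses."""
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay)

def select_action(q_values, step):
    """Epsilon-greedy action selection using the schedule above."""
    if random.random() < epsilon(step):
        return random.randrange(len(q_values))      # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

def reward(dist_to_goal, dist_to_obstacle, goal_radius=1.0, danger_radius=2.0):
    """Piecewise nonlinear reward: terminal bonus at the goal, terminal
    penalty on collision, distance shaping elsewhere, and a penalty that
    grows sharply as the USV nears an obstacle."""
    if dist_to_goal <= goal_radius:
        return 100.0                                # reached goal
    if dist_to_obstacle <= 0.0:
        return -100.0                               # collision
    r = -0.1 * dist_to_goal                         # progress shaping
    if dist_to_obstacle < danger_radius:            # nonlinear obstacle term
        r -= 10.0 / (dist_to_obstacle + 1e-6)
    return r

def bezier(points, n=20):
    """Sample a Bezier curve through the control points (de Casteljau's
    algorithm) to smooth a piecewise-linear planned path."""
    out = []
    for i in range(n + 1):
        t = i / n
        pts = [list(p) for p in points]
        while len(pts) > 1:
            pts = [[(1 - t) * a[0] + t * b[0], (1 - t) * a[1] + t * b[1]]
                   for a, b in zip(pts, pts[1:])]
        out.append(tuple(pts[0]))
    return out
```

Sampling `bezier` over a path's waypoints yields a smooth trajectory whose endpoints coincide with the original start and goal, which is why Bezier smoothing is a common post-processing step for grid-based planners.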

unmanned surface vessel; path planning; deep Q-learning algorithm; on-call search

Niu Yilong, Yang Yi, Zhang Kai, Mu Ying, Wang Qi, Wang Yingmin


School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China

无人水面艇 路径规划 深度Q学习算法 应召搜索

Supported by the National Natural Science Foundation of China (Grant No. 51879221)

2024

Acta Armamentarii
China Ordnance Society

Indexed by CSTPCD and the Peking University Core Journals list
Impact factor: 0.735
ISSN:1000-1093
Year, volume (issue): 2024, 45(9)