Journal of Dalian Maritime University, 2024, Vol. 50, Issue 1: 11-19. DOI: 10.16411/j.cnki.issn1006-7736.2024.01.002

Collision avoidance behavior decision-making of unmanned ship based on deep reinforcement learning

Guan Wei, Luo Wenzhe, Cui Zhewen

Author information

  • 1. Navigation College, Dalian Maritime University, Dalian, Liaoning 116026, China

Abstract

To address multi-ship collision avoidance for unmanned ships, a decision-making method for collision avoidance behavior in multi-ship encounters is proposed based on the deep deterministic policy gradient (DDPG) algorithm, combining knowledge of the ship domain, the International Regulations for Preventing Collisions at Sea (COLREGs), and ship maneuvering characteristics. The neural network model is built with gated recurrent units (GRU) and uses layer normalization, which allows it to process high-dimensional observation data effectively and improves the efficiency of behavior decision-making. The reward function designed in this paper conforms to the COLREGs and reflects the ship-handling practice of using small rudder angles for avoidance wherever possible. Simulation experiments of multi-ship encounters verify the advantages of the proposed method in terms of flexibility and effectiveness.
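The abstract describes a reward that respects the COLREGs and favors small rudder angles, but it gives no formula. The following is only a minimal sketch of how such a per-step reward might be composed; the function name `step_reward`, the three terms, the weights, the rudder limit, and the circular ship-domain radius are all hypothetical and are not taken from the paper. The COLREGs term is also a deliberate simplification (a give-way vessel is rewarded for starboard rudder), not a full encoding of the rules.

```python
import math

# All names, weights, and thresholds below are illustrative assumptions,
# not values taken from the paper.
MAX_RUDDER_DEG = 35.0     # assumed rudder limit (positive = starboard)
DOMAIN_RADIUS_NM = 0.5    # assumed circular ship-domain radius


def step_reward(rudder_deg, dist_to_target_nm, give_way,
                w_rudder=0.2, w_domain=10.0, w_rule=1.0):
    """Hypothetical per-step reward built from three additive terms."""
    # Small-rudder preference: quadratic penalty, zero at midships and
    # equal to -w_rudder at the assumed rudder limit.
    r_rudder = -w_rudder * (rudder_deg / MAX_RUDDER_DEG) ** 2

    # Ship-domain term: penalty grows linearly with how deep the target
    # ship has intruded into the assumed domain radius.
    intrusion = max(0.0, DOMAIN_RADIUS_NM - dist_to_target_nm) / DOMAIN_RADIUS_NM
    r_domain = -w_domain * intrusion

    # Simplified COLREGs term: when own ship is the give-way vessel,
    # starboard rudder is rewarded and port rudder is penalised.
    r_rule = 0.0
    if give_way and rudder_deg != 0.0:
        r_rule = math.copysign(w_rule, rudder_deg)

    return r_rudder + r_domain + r_rule
```

Under these assumptions, a give-way vessel applying 5 degrees of starboard rudder scores higher than one applying 30 degrees, and both score higher than a turn to port, which is the kind of ordering the abstract's description suggests.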

Key words

multi-ship collision avoidance / behavioral decision-making / International Regulations for Preventing Collisions at Sea (COLREGs) / deep reinforcement learning / gated recurrent unit (GRU)

Funding

National Natural Science Foundation of China (52171342)

Publication year

2024

Journal of Dalian Maritime University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.469
ISSN: 1006-7736
Number of references: 20