An autonomous collision avoidance decision algorithm for ships based on clipped DDQN with noise network
To address the overestimation and poor convergence of the deep Q-network (DQN) algorithm in autonomous collision avoidance decision-making for ships at sea, an algorithm based on clipped double DQN (DDQN) with a noise network, denoted NoisyNet-CDDQN, is proposed. The algorithm mitigates the overestimation of DQN by clipping the double Q-values, and it introduces a noise network to enhance stability and thereby address the poor convergence of DQN. The mathematical model of ship motion and the ship domain model are fully considered, and the reward function design accounts for factors such as yaw and the International Regulations for Preventing Collisions at Sea (COLREGs). Simulation experiments on multiple encounter scenarios show that the proposed NoisyNet-CDDQN algorithm improves convergence speed by 27.27% compared with the DQN algorithm with a noise network, by 54.55% compared with the DDQN algorithm, and by 87.27% compared with the DQN algorithm. The resulting autonomous collision avoidance decisions comply with COLREGs, so the algorithm can provide a reference for autonomous collision avoidance of ships.
Keywords: noise network; double deep Q-network (DDQN); ship autonomous collision avoidance; International Regulations for Preventing Collisions at Sea
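The two ingredients named in the abstract, clipping the double Q-values and adding a noise network, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes the commonly used forms: a target that selects the greedy action with one estimator and takes the minimum of two estimators for that action, and a linear layer with factorized Gaussian noise in the NoisyNet style. All names (`clipped_double_q_target`, `NoisyLinear`, `sigma0`) are hypothetical, and the fixed scalar `sigma` stands in for what would normally be learned per-weight parameters.

```python
import math
import random


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD target using the smaller of two Q estimates (assumed form).

    q1_next and q2_next are lists of next-state Q-values from two
    separate estimators; both names are illustrative.
    """
    if done:
        return reward
    # Select the greedy action with the first estimator.
    best_action = max(range(len(q1_next)), key=lambda a: q1_next[a])
    # Clip: evaluate that action with both estimators and keep the minimum,
    # which counteracts the upward bias of a plain max over noisy estimates.
    clipped_q = min(q1_next[best_action], q2_next[best_action])
    return reward + gamma * clipped_q


def _scale_noise(x):
    """f(x) = sign(x) * sqrt(|x|), used for factorized Gaussian noise."""
    return math.copysign(math.sqrt(abs(x)), x)


class NoisyLinear:
    """Linear layer whose weights are mu + sigma * eps (sketch only)."""

    def __init__(self, n_in, n_out, sigma0=0.5, rng=None):
        self.n_in, self.n_out = n_in, n_out
        self.rng = rng if rng is not None else random.Random()
        bound = 1.0 / math.sqrt(n_in)
        self.w_mu = [[self.rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_out)]
        self.b_mu = [self.rng.uniform(-bound, bound) for _ in range(n_out)]
        # Assumption: one fixed noise scale instead of learned per-weight sigmas.
        self.sigma = sigma0 / math.sqrt(n_in)
        self.reset_noise()

    def reset_noise(self):
        """Resample factorized noise: eps_w[j][i] = f(eps_out[j]) * f(eps_in[i])."""
        eps_in = [_scale_noise(self.rng.gauss(0.0, 1.0)) for _ in range(self.n_in)]
        eps_out = [_scale_noise(self.rng.gauss(0.0, 1.0)) for _ in range(self.n_out)]
        self.w_eps = [[eo * ei for ei in eps_in] for eo in eps_out]
        self.b_eps = eps_out

    def forward(self, x, noisy=True):
        """Return W x + b, with the noise term switched off when noisy=False."""
        scale = self.sigma if noisy else 0.0
        out = []
        for j in range(self.n_out):
            total = self.b_mu[j] + scale * self.b_eps[j]
            for i in range(self.n_in):
                total += (self.w_mu[j][i] + scale * self.w_eps[j][i]) * x[i]
            out.append(total)
        return out
```

As a usage illustration, `clipped_double_q_target(1.0, False, [1.0, 3.0, 2.0], [0.5, 2.0, 4.0], gamma=0.9)` selects action 1 and uses min(3.0, 2.0) = 2.0, giving a target of about 2.8, whereas an unclipped target built from the first estimator alone would be about 3.7. In the noisy layer, exploration comes from resampling the noise with `reset_noise()` rather than from an epsilon-greedy schedule.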