Application of the DDPG Deep Reinforcement Learning Algorithm to Unmanned Ship Target Tracking and Rescue

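Abstract (translated from the Chinese): To ensure the efficiency of maritime rescue operations, this study uses the Deep Deterministic Policy Gradient (DDPG) algorithm to design a vessel algorithm for tracking rescue targets, covering the state space, action space, and reward function, and applies it to unmanned ship tracking and rescue. The results show that the stable success rate of the DDPG algorithm is close to 100%, indicating excellent performance. The cumulative episode reward of the designed algorithm eventually stabilizes at about 10, and the average duration stabilizes at about 80 s. The algorithm can adjust its motion strategy according to the state of the surrounding environment, meets the urgency requirements of maritime rescue, and offers a new approach for research in related fields.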

English title (as published): Application of DDPG deep reinforcement learning algorithm in unmanned ship target tracking and rescue

English abstract: To ensure the efficiency of maritime rescue operations, an algorithm for ships tracking rescue targets is designed on the basis of the Deep Deterministic Policy Gradient (DDPG) algorithm in three respects (state space, action space, and reward function) and applied to unmanned ship tracking and rescue. The results show that the stable success rate of the DDPG algorithm is close to 100% and that its performance is excellent. The cumulative episode reward of the designed algorithm eventually stabilizes at about 10, while the average duration stabilizes at about 80 s. The algorithm can adjust its movement strategy according to the state of the surrounding environment, meets the urgency requirements of maritime rescue operations, and provides a new approach for research in related fields.

English keywords:

unmanned ship; target tracking; sea rescue; Deep Deterministic Policy Gradient (DDPG)

Authors:

Song Leizhen (宋雷震), Lü Dongfang (吕东芳)

Affiliation:

School of Intelligent Manufacturing, Huainan Union University, Huainan, Anhui 232001, China

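Keywords (translated from the Chinese):

unmanned ship; target tracking; maritime rescue; Deep Deterministic Policy Gradient (DDPG) algorithm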

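Funding:

Huainan Union University University-Level Natural Science Research Project; Key Project of the Anhui Provincial Natural Science Foundation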

Grant numbers:

LZX1902; KJ2021A1311

Publication year:

2024

DOI:

10.13524/j.2097-2873.2024.01.007

Journal of Engineering of Heilongjiang University (黑龙江大学工程学报)

Heilongjiang University

Impact factor: 0.358

ISSN: 2095-008X

Year, volume (issue): 2024, 15(1)

Number of references: 14