Application of the DDPG Deep Reinforcement Learning Algorithm to Unmanned Ship Target Tracking and Rescue

To keep maritime rescue operations efficient, this study uses the Deep Deterministic Policy Gradient (DDPG) algorithm to design a ship algorithm for tracking rescue targets, covering the state space, action space, and reward function, and applies it to unmanned ship tracking and rescue. The results show that the stable success rate of the DDPG algorithm is close to 100%, indicating excellent performance. The cumulative reward of the designed algorithm in the final episodes stabilizes at about 10, and the average duration stabilizes at about 80 s. The algorithm can adjust its motion strategy according to the state of the surrounding environment, meets the urgency requirements of maritime rescue, and offers a new approach for research in related fields.
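
The abstract names DDPG but does not give the paper's state, action, or reward definitions, so the following is only a minimal sketch of two generic DDPG building blocks: the critic's temporal-difference target and the soft (Polyak) target-network update. The function names `td_target` and `soft_update`, and the default values of `gamma` and `tau`, are illustrative assumptions and are not taken from the paper.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    `next_q` stands for the target critic's value at the next state under the
    target actor's action. Terminal steps do not bootstrap.
    (gamma = 0.99 is an assumed default, not a value from the paper.)
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Polyak averaging: theta_target <- tau * theta_online + (1 - tau) * theta_target.

    Parameters are represented as flat lists of floats to keep the sketch
    dependency-free. (tau = 0.005 is an assumed default.)
    """
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online, target)]
```

In a full agent these two steps would sit inside a training loop with a replay buffer and actor/critic networks; the slow target update is what typically keeps the critic's regression targets from shifting too quickly.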

unmanned ship; target tracking; sea rescue; Deep Deterministic Policy Gradient (DDPG)

Song Leizhen (宋雷震), Lü Dongfang (吕东芳)

School of Intelligent Manufacturing, Huainan Union University, Huainan, Anhui 232001, China

Huainan Union University Natural Science Research Project; Key Project of the Anhui Provincial Natural Science Foundation

LZX1902; KJ2021A1311

2024

Journal of Engineering of Heilongjiang University
Heilongjiang University

Impact factor: 0.358
ISSN: 2095-008X
Year, volume (issue): 2024, 15(1)