
Research on a Reinforcement Learning Algorithm for Small ROVs Integrating Supervised Learning

This paper first briefly reviews existing motion control algorithms for tethered remotely operated vehicles (ROVs) and explains the basic principles of the Deep Deterministic Policy Gradient (DDPG) algorithm, a deep reinforcement learning method. It then considers the problems that arise when DDPG is applied to ROV motion control: learning takes a long time and is difficult to converge, which wastes large amounts of data and increases storage overhead, and the generalization ability of the neural network is reduced, which lowers practicality. To address these problems, a supervised DDPG algorithm that integrates supervised learning is proposed. Finally, simulation experiments are carried out, and the results show that the improved DDPG algorithm is more effective than the conventional DDPG algorithm.

supervised learning; ROV; reinforcement learning

Huang Zhaojun, Zhang Yanjia, Zuo Xiaowen


Zhuhai City Polytechnic, Zhuhai, Guangdong 519090, China


2024

Journal of Ocean Technology
National Ocean Technology Center


CSTPCD
Impact factor: 0.327
ISSN: 1003-2029
Year, volume (issue): 2024, 43(6)