Deep Reinforcement Learning-based Trajectory Planning for Manipulator Obstacle Avoidance

To address the long planning times and overly long paths that traditional path planning algorithms produce for manipulator obstacle avoidance, a motion planning method based on deep reinforcement learning (DRL) is proposed. First, a mathematical model of the manipulator and its motion environment is constructed, the DOBOT manipulator and its operating environment are built in PyBullet, and the reward function, action variables, and state variables required for DRL are defined. Second, given the characteristics of static obstacle avoidance, the deep deterministic policy gradient (DDPG) algorithm is adopted and motion simulation experiments are carried out. The simulation results show that, compared with the rapidly-exploring random tree (RRT) algorithm and an improved RRT algorithm, the proposed DDPG approach achieves some improvement in both planning time and path length. Finally, the effectiveness of the DDPG algorithm for obstacle avoidance in a variety of obstacle environments is verified in the laboratory on a DOBOT manipulator.
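
The abstract states that a reward function is defined for the DRL agent but does not give its form. As a rough illustration only, the sketch below shows one common way such a reward is shaped for static obstacle avoidance: a penalty proportional to the distance from the end effector to the goal, plus an extra penalty inside a safety margin around the obstacle. The function name `shaped_reward`, every parameter, and all numeric weights and thresholds are hypothetical and are not taken from the paper.

```python
import math


def shaped_reward(ee_pos, goal_pos, obstacle_pos,
                  safe_dist=0.10, goal_tol=0.02,
                  w_goal=1.0, w_obs=0.5):
    """Hypothetical shaped reward for reaching a goal past one static obstacle.

    ee_pos, goal_pos, obstacle_pos: (x, y, z) positions in metres.
    Returns (reward, done). All weights and thresholds are illustrative
    assumptions, not values from the paper.
    """
    d_goal = math.dist(ee_pos, goal_pos)      # end effector to goal
    d_obs = math.dist(ee_pos, obstacle_pos)   # end effector to obstacle centre

    # Reaching the goal within tolerance ends the episode with a bonus.
    if d_goal <= goal_tol:
        return 10.0, True

    # Dense term: the closer to the goal, the smaller the penalty.
    reward = -w_goal * d_goal

    # Extra penalty that grows linearly as the safety margin is violated.
    if d_obs < safe_dist:
        reward -= w_obs * (safe_dist - d_obs) / safe_dist

    return reward, False
```

In a simulation loop, a function like this would be called once per step with positions read from the simulator, and the returned reward stored alongside the state and action in the agent's replay buffer.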

Keywords: Manipulator; Deep reinforcement learning; Obstacle avoidance path planning; Deep deterministic policy gradient algorithm

曹毅、郭银辉、李磊、朱柏宇、赵治华

School of Mechanical and Electrical Engineering, Henan University of Technology, Zhengzhou 450001, Henan, China

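Funding: Natural Science Foundation of the Henan Provincial Department of Education; National Engineering Laboratory for Wheat and Corn Further Processing project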

20A413004; 114100510015

2023

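Journal of Mechanical Transmission (机械传动)
Zhengzhou Research Institute of Mechanical Engineering; Gear Branch of the China General Machine Components Industry Association; Chinese Mechanical Engineering Society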

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.534
ISSN:1004-2539
Year, volume (issue): 2023, 47(12)