
Multi-AGV motion planning based on deep reinforcement learning

To solve the conflict-free motion planning problem for multiple automated guided vehicles (AGVs) in mobile robot fulfillment systems, a Markov decision process (MDP) model is established and a new solution method based on the deep Q-network (DQN) is proposed. The positions of the AGVs serve as the input, the DQN estimates the maximum expected cumulative reward obtainable by taking each action in that state, and the network is trained with the classical deep Q-learning algorithm. Computational results on problem instances show that the method effectively avoids collisions among the AGVs in motion, so that the fleet completes all rack transportation tasks without conflicts. Compared with an existing planning heuristic from the literature, the AGV motion plans generated by the method have a shorter average makespan.
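The abstract names two ingredients of classical deep Q-learning: a network that scores every action for a given state, and a training rule that regresses those scores toward a bootstrapped target. The sketch below illustrates only those two generic pieces in plain Python. It is not the authors' implementation. The action set `ACTIONS`, the discount factor `gamma`, the exploration rate `epsilon`, and all function names are assumptions introduced for illustration; the abstract specifies none of them.

```python
import random

# Hypothetical action set for one AGV on a grid map (an assumption;
# the abstract does not list the actual actions).
ACTIONS = ("up", "down", "left", "right", "wait")


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Q-learning regression target: r + gamma * max_a' Q(s', a')."""
    if done:
        # No future reward after a terminal transition.
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def squared_td_error(q_values, action, target):
    """Per-transition loss: (target - Q(s, a))^2."""
    return (target - q_values[action]) ** 2
```

In a full DQN, `q_values` and `next_q_values` would come from a neural network evaluated on the current and next AGV positions, and the squared error would be minimized by gradient descent over mini-batches of stored transitions.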

multi-automated guided vehicle; motion planning; Markov decision process; deep Q-network; deep Q-learning

Sun Hui (孙辉), Yuan Wei (袁维)

School of Mechanical Engineering, Southeast University, Nanjing, Jiangsu 211189, China

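2016 Intelligent Manufacturing Comprehensive Standardization funded project

工信部联装[2016]213号 (Ministry of Industry and Information Technology document [2016] No. 213)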

2024

Computer Integrated Manufacturing Systems (计算机集成制造系统)
The 210th Research Institute of China North Industries Group Corporation

CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.092
ISSN: 1006-5911
Year, volume (issue): 2024, 30(2)