A Path Planning Algorithm for Multiple AGVs Based on MADDPG

To address path planning for multiple automated guided vehicle systems (AGVs) delivering goods in dynamic, uncertain environments, this paper proposes a multi-AGV path planning algorithm based on multi-agent deep deterministic policy gradient (MADDPG). The method redesigns the model structure of MADDPG in terms of its state space, action space, reward function, and network structure, and builds a two-dimensional simulation environment through the OpenAI Gym interface as a training platform for the AGVs (agents). Experimental results in an intelligent warehousing simulation environment show that, compared with the deep deterministic policy gradient (DDPG) algorithm and the twin delayed deep deterministic policy gradient (TD3) algorithm, the MADDPG-based algorithm reduces the number of AGV collisions with shelves by 21.49% and 11.63%, respectively, and the number of collisions with obstacles by 14.69% and 10.12%, respectively, while the success rate of all AGVs reaching the cargo loading and unloading point is 17.22% and 10.53% higher, respectively. These results indicate that the trained AGVs have more efficient online decision-making and adaptive capabilities and can find better paths.
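The abstract mentions a two-dimensional simulation environment exposed through a Gym-style interface, with a designed state space, action space, and reward function. The following is a minimal sketch of what such an environment could look like, written in plain Python so that it runs without Gym installed. Everything specific here is an assumption made for illustration: the class name `MultiAGVGridEnv`, the map size, the observation layout, and the reward constants are hypothetical and are not the paper's settings, which this record does not give.

```python
import math
import random


class MultiAGVGridEnv:
    """Toy 2D multi-AGV environment with a Gym-style reset()/step() API.

    All sizes, reward values, and names are illustrative assumptions,
    not the settings used in the paper.
    """

    def __init__(self, n_agents=2, size=10.0, goal=(9.0, 9.0),
                 obstacles=((5.0, 5.0),), radius=0.5, max_step=1.0, seed=0):
        self.n_agents = n_agents
        self.size = size
        self.goal = goal
        self.obstacles = obstacles
        self.radius = radius
        self.max_step = max_step
        self.rng = random.Random(seed)
        self.pos = []

    def _obs(self, i):
        # Per-agent observation: own position plus offset to the goal.
        x, y = self.pos[i]
        return (x, y, self.goal[0] - x, self.goal[1] - y)

    def reset(self):
        # Start each AGV at a random point in the lower-left quarter.
        self.pos = [(self.rng.uniform(0, self.size / 4),
                     self.rng.uniform(0, self.size / 4))
                    for _ in range(self.n_agents)]
        return [self._obs(i) for i in range(self.n_agents)]

    def step(self, actions):
        # Each action is a continuous (dx, dy) pair, clipped to max_step.
        rewards, dones = [], []
        for i, (dx, dy) in enumerate(actions):
            dx = max(-self.max_step, min(self.max_step, dx))
            dy = max(-self.max_step, min(self.max_step, dy))
            x = max(0.0, min(self.size, self.pos[i][0] + dx))
            y = max(0.0, min(self.size, self.pos[i][1] + dy))
            self.pos[i] = (x, y)
            dist = math.hypot(self.goal[0] - x, self.goal[1] - y)
            reward = -0.1 * dist          # dense shaping toward the goal
            if any(math.hypot(ox - x, oy - y) < self.radius
                   for ox, oy in self.obstacles):
                reward -= 5.0             # collision penalty
            reached = dist < self.radius
            if reached:
                reward += 10.0            # arrival bonus
            rewards.append(reward)
            dones.append(reached)
        obs = [self._obs(i) for i in range(self.n_agents)]
        return obs, rewards, dones, {}
```

In a MADDPG setup, each agent's actor would consume only its own entry of the returned observation list, while the centralized critic would see all observations and actions during training. The per-agent reward list returned by `step` is what makes that split possible in this sketch.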

Keywords: automated guided vehicle system; path planning; multi-agent deep deterministic policy gradient (MADDPG) algorithm; deep reinforcement learning; multiple agents

YIN Huayi (尹华一), YOU Yali (尤雅丽), HUANG Xindong (黄新栋), DUAN Qingna (段青娜)

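School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, Fujian 361024, China

School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen, Fujian 361024, China

Hongyun Honghe Tobacco (Group) Co., Ltd., Kunming, Yunnan 650000, China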

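Funding: National Natural Science Foundation of China (61503316, 62372392); Natural Science Foundation of Fujian Province (2021J011182, 2021J011191, 2022J011275); Xiamen Science and Technology Bureau University Industry-University-Research Project (2022CXY0401)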

2024

Journal of Xiamen University of Technology
Xiamen University of Technology

Impact factor: 0.196
ISSN: 1673-4432
Year, volume (issue): 2024, 32(1)