A Path Planning Algorithm for Multiple AGVs Based on MADDPG

Abstract: To address path planning for a system of multiple automated guided vehicles (AGVs) that deliver goods in dynamic, uncertain environments, this paper proposes a multi-AGV path planning algorithm based on multi-agent deep deterministic policy gradient (MADDPG). The method redesigns the model structure of the MADDPG algorithm in terms of its state space, action space, reward function, and network structure, and builds a two-dimensional simulation environment through the OpenAI Gym interface as a training platform for the AGVs (agents). Experiments in an intelligent warehousing simulation environment show that, compared with the deep deterministic policy gradient (DDPG) algorithm and the twin delayed deep deterministic policy gradient (TD3) algorithm, the proposed algorithm reduces the number of AGV collisions with shelves by 21.49% and 11.63%, respectively, and the number of collisions with obstacles by 14.69% and 10.12%, respectively. The success rate of all AGVs reaching the cargo loading and unloading point is 17.22% and 10.53% higher, respectively. These results indicate that the trained AGVs make online decisions more efficiently, adapt better to the environment, and can find better paths.
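The abstract names the ingredients of the approach (state space, action space, reward function, a two-dimensional Gym-style environment, and MADDPG) without giving their details. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the class `MultiAGVEnv`, the helper `critic_input`, the observation layout, the rectangular shelf, and the reward constants are all hypothetical. It uses only the Python standard library and mimics the `reset()`/`step()` shape of a Gym environment rather than importing Gym itself.

```python
import math
import random

# Hypothetical reward constants; the source does not state its reward values.
STEP_PENALTY = -0.01
COLLISION_PENALTY = -1.0
GOAL_REWARD = 1.0


def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


class MultiAGVEnv:
    """Toy 2D warehouse with a Gym-style reset()/step() interface (illustrative only)."""

    def __init__(self, n_agents=2, size=10.0, shelves=((4.0, 4.0, 6.0, 6.0),),
                 goal=(9.0, 9.0), goal_radius=0.5, max_step=0.5, seed=0):
        self.n_agents = n_agents
        self.size = size
        self.shelves = shelves        # rectangles as (x_min, y_min, x_max, y_max)
        self.goal = goal              # assumed loading/unloading point
        self.goal_radius = goal_radius
        self.max_step = max_step      # largest displacement per step and axis
        self.rng = random.Random(seed)
        self.positions = []

    def _in_shelf(self, x, y):
        return any(x0 <= x <= x1 and y0 <= y <= y1
                   for x0, y0, x1, y1 in self.shelves)

    def _observe(self, i):
        # Assumed local observation: own position plus offset to the goal.
        x, y = self.positions[i]
        return [x, y, self.goal[0] - x, self.goal[1] - y]

    def reset(self):
        # Place every AGV at a random shelf-free point in the lower-left corner.
        self.positions = []
        while len(self.positions) < self.n_agents:
            x, y = self.rng.uniform(0.0, 3.0), self.rng.uniform(0.0, 3.0)
            if not self._in_shelf(x, y):
                self.positions.append((x, y))
        return [self._observe(i) for i in range(self.n_agents)]

    def step(self, actions):
        rewards, dones = [], []
        for i, (ax, ay) in enumerate(actions):
            # Continuous action in [-1, 1]^2, scaled to a displacement.
            x, y = self.positions[i]
            nx = clip(x + clip(ax, -1.0, 1.0) * self.max_step, 0.0, self.size)
            ny = clip(y + clip(ay, -1.0, 1.0) * self.max_step, 0.0, self.size)
            reward = STEP_PENALTY
            if self._in_shelf(nx, ny):
                reward += COLLISION_PENALTY   # hit a shelf: penalize, stay put
                nx, ny = x, y
            self.positions[i] = (nx, ny)
            reached = math.hypot(nx - self.goal[0], ny - self.goal[1]) <= self.goal_radius
            if reached:
                reward += GOAL_REWARD
            rewards.append(reward)
            dones.append(reached)
        observations = [self._observe(i) for i in range(self.n_agents)]
        return observations, rewards, dones, {}


def critic_input(observations, actions):
    """Flatten all agents' observations and actions into one vector.

    MADDPG trains each agent's critic on this joint information, while each
    actor sees only its own observation.
    """
    flat = [v for obs in observations for v in obs]
    flat += [a for act in actions for a in act]
    return flat


# Usage: two AGVs take one step, then the joint critic input is assembled.
env = MultiAGVEnv(n_agents=2)
obs = env.reset()
acts = [(1.0, 1.0), (0.5, -0.2)]
next_obs, rewards, dones, info = env.step(acts)
joint = critic_input(obs, acts)   # 2 agents * 4 observation values + 2 agents * 2 action values
```

The centralized-critic, decentralized-actor split is the part of this sketch that comes from MADDPG itself; the environment dynamics and rewards are placeholders that a real study would replace with its own design.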

Keywords:

automated guided vehicle system; path planning; multi-agent deep deterministic policy gradient (MADDPG) algorithm; deep reinforcement learning; multiple agents

Authors:

Yin Huayi (尹华一), You Yali (尤雅丽), Huang Xindong (黄新栋), Duan Qingna (段青娜)

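Author affiliations:

School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, Fujian 361024

School of Optoelectronic and Communication Engineering, Xiamen University of Technology, Xiamen, Fujian 361024

Hongyun Honghe Tobacco (Group) Co., Ltd., Kunming, Yunnan 650000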

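Funding:

National Natural Science Foundation of China (61503316, 62372392); Natural Science Foundation of Fujian Province (2021J011182, 2021J011191, 2022J011275); Industry-University-Research Project for Higher Education Institutions of the Xiamen Science and Technology Bureau (2022CXY0401)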

Publication year:

2024

DOI:

10.19697/j.cnki.1673-4432.202401006

Journal of Xiamen University of Technology (厦门理工学院学报)

Xiamen University of Technology

Impact factor: 0.196

ISSN: 1673-4432

Year, volume (issue): 2024, 32(1)

Number of references: 15