Solving the joint scheduling problem of machines and AGVs in job shops with deep reinforcement learning

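Abstract (translated from the Chinese): To address the joint scheduling of automated guided vehicles (AGVs) and machines in a job shop, an integrated algorithm framework based on convolutional neural networks and deep reinforcement learning is proposed, with the objective of minimizing the makespan. First, the disjunctive graph of job shop scheduling with AGVs is analyzed, and the problem is converted into a sequential decision problem and formulated as a Markov decision process. Next, in line with the characteristics of the problem, a spatial state based on the disjunctive graph and five direct state features are designed. For the action space, a two-dimensional action space comprising operation selection and AGV assignment is designed. Because processing times and effective transportation times in the job shop are constants, a reward function is constructed on that basis to guide the agent's learning. Finally, a 2D-PPO algorithm tailored to the two-dimensional action space is designed for training and learning, so that joint AGV and machine scheduling decisions can be made quickly. Experiments on instances show that the scheduling algorithm based on 2D-PPO has good learning performance and scalability.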
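The two-dimensional action space described in the abstract (an operation choice paired with an AGV assignment) can be illustrated with a minimal sketch. This is not the authors' 2D-PPO implementation, whose network and update rule are not given here. The function name `sample_joint_action`, the `logits` grid, and the `feasible` mask are hypothetical names chosen for illustration. The sketch assumes a policy emits one score per (operation, AGV) pair, and that infeasible pairs are masked out before sampling.

```python
import math
import random


def sample_joint_action(logits, feasible, rng):
    """Sample an (operation, AGV) pair from a masked softmax over a 2-D grid.

    logits[i][j]   -- hypothetical policy score for operation i carried by AGV j
    feasible[i][j] -- True if that pair may be chosen in the current state
    Returns (operation, agv, log_probability).
    """
    # Keep only the feasible pairs; infeasible ones get zero probability.
    pairs = [(i, j)
             for i, row in enumerate(feasible)
             for j, ok in enumerate(row) if ok]
    if not pairs:
        raise ValueError("no feasible (operation, AGV) pair")

    # Softmax with the maximum subtracted for numerical stability.
    peak = max(logits[i][j] for i, j in pairs)
    weights = [math.exp(logits[i][j] - peak) for i, j in pairs]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Draw one pair; the log-probability is what a PPO-style update would use.
    k = rng.choices(range(len(pairs)), weights=probs, k=1)[0]
    op, agv = pairs[k]
    return op, agv, math.log(probs[k])


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]   # 3 operations x 2 AGVs
    feasible = [[True, False], [False, False], [True, True]]
    print(sample_joint_action(logits, feasible, rng))
```

Sampling over the joint grid, rather than choosing the operation and the AGV independently, keeps infeasible combinations out of the distribution. Whether the paper factorizes the two dimensions differently is not stated in the abstract.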

English title: Deep reinforcement learning for solving the joint scheduling problem of machines and AGVs in job shop

English abstract: Aiming at the joint scheduling problem of automated guided vehicles (AGVs) and machines in the job shop, an integrated algorithm framework based on a convolutional neural network and deep reinforcement learning is proposed with the goal of minimizing the completion time. Firstly, the job shop scheduling disjunctive graph containing AGVs is analyzed, and the problem is transformed into a sequential decision problem, which is expressed as a Markov decision process. Then, according to the solving characteristics of the problem, a spatial state and five direct state features based on the disjunctive graph are designed. In the setting of the action space, a two-dimensional action space including operation selection and AGV assignment is designed. Because processing time and effective transportation time in the job shop are fixed values, a reward function is constructed to guide the agent to learn. Finally, a 2D-PPO algorithm for the two-dimensional action space is designed for training and learning, to respond quickly to joint scheduling decisions for AGVs and machines. Case studies show that the scheduling algorithm based on 2D-PPO has good learning performance and scalability.

English keywords:

job shop scheduling; automated guided vehicle; deep reinforcement learning; Markov decision process; proximal policy optimization; joint scheduling

Authors:

Sun Aihong (孙爱红), Lei Qi (雷琦), Song Yuchuan (宋豫川), Yang Yunfan (杨云帆)

Affiliation:

State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing 400044, China

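Keywords (translated from the Chinese):

job shop scheduling; automated guided vehicle; deep reinforcement learning; Markov decision process; proximal policy optimization; joint scheduling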

Funding:

National Natural Science Foundation of China

Grant number:

51205429

Publication year:

2024

DOI:

10.13195/j.kzyjc.2022.1821

Journal: Control and Decision (控制与决策)

Northeastern University

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.227

ISSN: 1001-0920

Year, volume (issue): 2024, 39(1)

Number of references: 18