Fuzzy job shop scheduling problem based on deep reinforcement learning

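Abstract (translated from Chinese): This paper addresses the job shop scheduling problem with fuzzy processing times and fuzzy due dates, with the objective of minimizing the maximum completion time. Using the proximal policy optimization (PPO) algorithm as the basic optimization framework, it proposes an LSTM-PPO (proximal policy optimization with long short-term memory) algorithm to solve the problem. First, a new state feature is designed to model the scheduling problem, and job operations are selected directly from the modeled state features, which more closely matches the scheduling decision process in a real environment. Second, a long short-term memory (LSTM) network is applied to the actor-critic framework of PPO to address the difficulty traditional models have in scaling when the problem size changes, so that the agent can still obtain a final scheduling solution when the numbers of jobs, operations, and machines change. Experiments on the selected set of fuzzy job shop scheduling problems verify that the algorithm achieves better performance.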

English title: Fuzzy job shop scheduling problem based on deep reinforcement learning

English abstract: For the job shop scheduling problem with fuzzy processing times and fuzzy due dates, this paper uses the proximal policy optimization (PPO) algorithm as the basic optimization framework, with the objective of minimizing the maximum completion time. An LSTM-PPO (proximal policy optimization with long short-term memory) algorithm is proposed to solve the problem. First, a new state feature is designed to model the scheduling problem, and operations are selected directly based on the modeled state feature, which is closer to the actual scheduling decision process. Then, the long short-term memory (LSTM) network is applied to the actor-critic framework of the PPO algorithm, which addresses the difficulty traditional models have in scaling when the problem size changes and enables the agent to obtain a final scheduling solution even when the numbers of jobs, operations, and machines change. Experiments on the selected set of fuzzy job shop scheduling problems verify that the algorithm achieves better performance.
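
The abstract names PPO as the basic optimization framework but does not give its loss. As a rough illustration only, the sketch below computes the standard clipped surrogate objective of PPO in plain Python. The function name `ppo_clip_objective`, its parameters, and the clipping value `clip_eps=0.2` are assumptions made for this example; they are not taken from the paper, and the paper's LSTM-based actor-critic network and state features are not reproduced here.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_logps, old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    clip_eps: clipping range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a scheduling setting, each action would correspond to choosing the next operation to dispatch, and the advantages would come from the critic; both are left abstract in this sketch.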

English keywords:

deep learning; reinforcement learning; proximal policy optimization; fuzzy job shop scheduling

Authors:

Zhu Jiazheng (朱家政), Zhang Hongli (张宏立), Wang Cong (王聪), Li Xinkai (李新凯), Dong Yingchao (董颖超)

Affiliation:

School of Electrical Engineering, Xinjiang University, Urumqi 830047

Keywords:

deep learning; reinforcement learning; proximal policy optimization algorithm; fuzzy job shop scheduling

Funding:

National Natural Science Foundation of China project; National Natural Science Foundation of China project

Project numbers:

51967019; 52065064

Publication year:

2024

DOI:

10.13195/j.kzyjc.2022.1345

Control and Decision (控制与决策)

Northeastern University

CSTPCD; Peking University Core Journals

Impact factor: 1.227

ISSN: 1001-0920

Year, volume (issue): 2024, 39(2)

Number of references: 14