Fuzzy job shop scheduling problem based on deep reinforcement learning

This paper addresses the job shop scheduling problem with fuzzy processing times and fuzzy due dates, with the objective of minimizing the maximum completion time (makespan). Using the proximal policy optimization (PPO) algorithm as the basic optimization framework, it proposes an LSTM-PPO (proximal policy optimization with long short-term memory) algorithm to solve the problem. First, a new state feature is designed to model the scheduling problem, and job operations are selected directly from the modeled state features, which is closer to the scheduling decision process in a real environment. Second, a long short-term memory (LSTM) network is applied within the actor-critic framework of PPO to address the difficulty traditional models have in scaling when the problem size changes, so that the agent can still obtain a final schedule when the number of jobs, operations, or machines changes. Experiments on the selected set of fuzzy job shop scheduling instances show that the algorithm achieves better performance.

Keywords: deep learning; reinforcement learning; proximal policy optimization; fuzzy job shop scheduling

Authors: Zhu Jiazheng (朱家政), Zhang Hongli (张宏立), Wang Cong (王聪), Li Xinkai (李新凯), Dong Yingchao (董颖超)

School of Electrical Engineering, Xinjiang University, Urumqi 830047, China

Funding: National Natural Science Foundation of China (51967019, 52065064)

2024

Control and Decision (控制与决策)
Northeastern University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 1.227
ISSN: 1001-0920
Year, volume (issue): 2024, 39(2)