
Flexible Job-Shop Scheduling Method Based on Deep Reinforcement Learning

Affected by dynamic disturbances on the shop floor, a single scheduling rule cannot consistently obtain good scheduling results in shop scheduling problems. To address this, this paper proposes a scheduling method based on dueling double DQN (D3QN) for the flexible job-shop scheduling problem. First, the scheduling problem is transformed into a Markov decision process to construct a mathematical model of the reinforcement learning task, for which 18 state features of the production system, 9 scoring actions for evaluating machines and jobs, and reward functions tied to the scheduling objectives are designed. Then, based on the dueling double DQN algorithm, a machine agent and a job agent are continuously trained through interaction with the shop-floor production system to select the machine and the job with the highest score at each scheduling decision point, thereby completing the allocation of jobs to machines. Finally, simulation experiments compare the proposed method with methods that directly select machine numbers or that select scheduling rules; the results show that the proposed method achieves better scheduling results.
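
For concreteness, the following is a minimal PyTorch sketch of the dueling double DQN (D3QN) machinery the abstract describes; it is an illustrative reconstruction, not the authors' implementation. The 18-dimensional state and the 9 scoring actions follow the abstract, while the hidden-layer width, discount factor, and all function names are assumptions.

import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling Q-network: a shared trunk splits into a state-value
    stream V(s) and an advantage stream A(s, a), recombined as
    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""

    def __init__(self, state_dim: int = 18, n_actions: int = 9, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)              # V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        h = self.trunk(s)
        a = self.advantage(h)
        return self.value(h) + a - a.mean(dim=-1, keepdim=True)

def double_dqn_target(online: nn.Module, target: nn.Module,
                      reward: torch.Tensor, s_next: torch.Tensor,
                      done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Double DQN target: the online network selects the argmax action,
    the target network evaluates it, reducing Q-value overestimation."""
    with torch.no_grad():
        a_star = online(s_next).argmax(dim=-1, keepdim=True)    # selection
        q_next = target(s_next).gather(-1, a_star).squeeze(-1)  # evaluation
        return reward + gamma * (1.0 - done) * q_next

# Assumed two-agent setup: the machine agent and the job agent would each
# hold their own online/target pair; after training, each acts greedily,
# taking the scoring action with the highest Q-value at every decision point.
machine_online, machine_target = DuelingQNet(), DuelingQNet()
machine_target.load_state_dict(machine_online.state_dict())

Note that in this formulation the 9 actions are scoring rules rather than direct machine or job indices, which presumably keeps the action-space size independent of how many machines and jobs are in the shop, in contrast to the compared method that selects machine numbers directly.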

Deep reinforcement learning; Flexible job-shop scheduling; Neural networks; Deep Q-network; Reward function

Guo Yu, Tang Dunbing, Zhang Zequn


Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China


2024

Aeronautical Manufacturing Technology (航空制造技术)
Beijing Aeronautical Manufacturing Engineering Research Institute (北京航空制造工程研究所)


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.403
ISSN: 1671-833X
Year, Volume (Issue): 2024, 67(23)