Scheduling Method for Aircraft Assembly Pulsation Production Lines Based on Deep Reinforcement Learning and Knowledge Transfer


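Chinese abstract (translated): Aircraft assembly is a key stage of aircraft manufacturing. How to schedule aircraft assembly pulsation production lines rationally, so as to cut costs and raise efficiency, is an important scientific problem in intelligent manufacturing. However, the pulsation-line setting is complex: assembling a single aircraft involves more than ten thousand operations, which poses new challenges for formally modeling and efficiently solving the aircraft assembly scheduling problem. As a result, current production practice relies mainly on manual scheduling based on the experience of human experts. This paper targets the optimization objective of reducing human resource load and proposes two domain-specific techniques for the aircraft assembly scheduling problem. First, the scheduling problem of aircraft assembly pulsation production lines is modeled as two Markov decision processes, and dual reinforcement learning agents make decisions to generate approximate scheduling solutions for aircraft assembly. Second, to address the insufficient robustness of reinforcement learning decisions, a domain knowledge transfer method is proposed that transfers the solving knowledge of reinforcement learning to constraint pruning in integer programming; an integer programming solver is then used to obtain scheduling solutions with excellent overall performance. Experiments on real data from aircraft assembly production lines show that the proposed method, based on deep reinforcement learning and knowledge transfer, scales successfully to scheduling aircraft assembly pulsation production lines with an annual output of nearly one hundred aircraft, brings a problem that combinatorial optimization methods struggle to solve down to minute-level solving time, and achieves significant performance advantages over baseline methods.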

English title: Scheduling approach for aircraft assembly pulsation production lines with deep reinforcement learning and knowledge transfer

English abstract: Aircraft assembly is a critical process in aircraft manufacturing. Scheduling aircraft assembly pulsation production lines rationally, for cost reduction and efficiency improvement, is an important scientific problem in the intelligent manufacturing field. However, the scenario of aircraft assembly lines is complex, with each assembly involving tens of thousands of operations, which poses new challenges for formally modeling and efficiently solving the aircraft assembly scheduling problem. Consequently, current industry practice relies heavily on manual scheduling based on the expertise of human professionals. This paper aims to minimize human resource load and proposes two domain-specific techniques to address the scheduling problem of aircraft assembly pulsation lines. First, the scheduling problem of aircraft assembly pulsation production lines is modeled as two Markov decision processes, and a bi-level reinforcement learning agent is used to make decisions on feasible scheduling solutions for aircraft assembly. Second, to tackle the insufficient robustness of reinforcement learning decisions, a domain-knowledge transfer paradigm is proposed, in which the problem-solving knowledge obtained via reinforcement learning is transferred to the constraint pruning process of the integer linear programming model, and the final scheduling solutions with excellent overall performance are obtained through an integer programming solver. Experiments are conducted on real scheduling data from aircraft assembly pulsation production lines. Results demonstrate that the proposed scheduling method based on reinforcement learning and knowledge transfer can successfully scale up to scheduling assembly pulsation production lines with an output of nearly one hundred aircraft per year, a problem intractable for combinatorial optimization methods. The solving time of the proposed method is reduced to minutes, and its performance shows significant advantages over baseline methods.

English keywords:

aircraft assembly; intelligent scheduling; combinatorial optimization; reinforcement learning; knowledge transfer

Authors:

钟金成, 马浩宇, 龙明盛, 王建民


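Author affiliations:

School of Software, Tsinghua University, Beijing 100084

Beijing National Research Center for Information Science and Technology, Beijing 100084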

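Keywords (Chinese):

飞机装配 (aircraft assembly); 智能调度 (intelligent scheduling); 组合优化 (combinatorial optimization); 强化学习 (reinforcement learning); 知识迁移 (knowledge transfer)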

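Funding:

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project; National Natural Science Foundation of China; National Natural Science Foundation of China; Beijing Nova Program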

Project numbers:

2020AAA0109201; 62021002; 62022050; Z201100006820041

Publication year:

2024

DOI:

10.1360/SSI-2023-0197

中国科学F辑

Chinese Academy of Sciences; National Natural Science Foundation of China


CSTPCD; 北大核心 (Peking University Core Journals)

Impact factor: 1.438

ISSN: 1674-5973

Year, volume (issue): 2024, 54(6)