
Task Analysis Methods Based on Deep Reinforcement Learning

To address the high coupling of task interactions and the many influencing factors in task analysis, a task analysis method based on sequence decoupling and deep reinforcement learning (DRL) is proposed, achieving task decomposition and task sequence reconstruction under complex constraints. The method designs a DRL environment based on task information interaction, and improves the SumTree algorithm using the difference between the loss functions of the target network and the evaluation network to realize priority evaluation among tasks. The operating mechanism of the activation function is introduced into the DRL network to extract task features; a greedy activation factor is proposed to optimize the parameters of the deep neural network and to determine the agent's optimal state, thereby driving its state transitions. A multi-objective task execution sequence diagram is generated through experience replay. Simulation results show that the method generates an executable task graph under optimal scheduling and, compared with static scenarios, adapts well to dynamic scenarios, indicating good prospects for application in domain task planning.
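As a concrete illustration of the replay mechanism the abstract describes, below is a minimal sketch, under assumed names (SumTree, priority_from_losses, greedy_select are all hypothetical, not the authors' code): a SumTree keyed on the gap between the target-network and evaluation-network estimates, plus a conventional epsilon-greedy stand-in for the greedy activation factor, whose exact formulation the abstract does not give.

```python
# Minimal sketch (not the authors' implementation): SumTree-based
# prioritized experience replay, with per-transition priority taken as the
# gap between target-network and evaluation-network estimates, as the
# abstract describes. All identifiers are illustrative assumptions.
import random
import numpy as np

class SumTree:
    """Leaves hold transition priorities; each internal node holds the sum
    of its subtree, so a uniform draw over [0, total) samples a leaf with
    probability proportional to its priority."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = np.zeros(2 * capacity - 1)  # internal nodes + leaves
        self.data = [None] * capacity           # stored transitions
        self.write = 0                          # next slot to overwrite

    def add(self, priority, transition):
        leaf = self.write + self.capacity - 1
        self.data[self.write] = transition
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, leaf, priority):
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf != 0:                        # propagate the change upward
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def sample(self, value):
        """Walk down from the root; value in [0, total) picks a leaf."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):     # until idx is a leaf
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return self.data[idx - self.capacity + 1]

    @property
    def total(self):
        return self.tree[0]

def priority_from_losses(eval_q, target_q, eps=1e-3):
    # Larger disagreement between evaluation and target networks gives a
    # higher replay priority; eps keeps every transition sampleable.
    return abs(target_q - eval_q) + eps

def greedy_select(q_values, factor):
    # Stand-in for the paper's greedy activation factor: with probability
    # `factor` exploit the best action, otherwise explore uniformly.
    if random.random() < factor:
        return int(np.argmax(q_values))
    return random.randrange(len(q_values))
```

Under these assumptions, tree.add(priority_from_losses(q_eval, q_target), transition) stores a transition, and tree.sample(random.uniform(0, tree.total)) draws one in proportion to its priority.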

task analysis; reinforcement learning; evaluation network; greedy factor; coupled tasks; activation functions

GONG Xue, PENG Pengfei, RONG Li, ZHENG Yalian, JIANG Jun


Naval University of Engineering, Wuhan 430033, Hubei, China

State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, Hubei, China


National Key R&D Program of China (2017YFC1405205); Independent Research Project of the Scientific Research Development Fund of Naval University of Engineering (425317S107)

2024

Journal of System Simulation
Beijing Simulation Center; China Simulation Federation


CSTPCD; Peking University Core Journals
Impact factor: 0.551
ISSN:1004-731X
Year, Volume (Issue): 2024, 36(7)