
Real-Time Scheduling of Parallel Tasks Based on Reinforcement Learning

Most existing parallel task scheduling algorithms do not account for environmental instability and lack generality and real-time performance. To address these problems, a real-time scheduling method for parallel tasks based on reinforcement learning is proposed. Task scheduling is modeled as a Markov decision process, and the scheduling policy is optimized with proximal policy optimization (PPO) through the interaction between the agent and the environment. A simulation-based method is used to construct the reward function, and a denoising autoencoder adds an empirical term to the advantage estimator, enabling the agent to learn efficient and reliable scheduling policies. Comparative simulation experiments in two scenarios show that, compared with existing methods, the proposed method improves time utilization by more than 17% and output by more than 16%, and can schedule in real time within milliseconds.

Keywords: Reinforcement learning; Parallel tasks; Real-time scheduling; Simulation; Denoising autoencoder
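The abstract states that a denoising autoencoder (DAE) adds an empirical term to the advantage estimator used by PPO, but the paper's exact formulation is not reproduced in this record. The Python sketch below is only one plausible reading of that idea, shown for illustration: generalized advantage estimates are combined with a reconstruction-error term from a DAE assumed to be trained on previously visited scheduling states, and a PPO update would then consume augmented_advantages in place of the raw GAE values. All names (DenoisingAutoencoder, augmented_advantages, beta) and the specific weighting are assumptions, not the authors' implementation.

import numpy as np
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Reconstructs clean state features from noise-corrupted inputs.
    Assumed to be trained separately on states visited by the agent."""
    def __init__(self, state_dim, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Linear(hidden_dim, state_dim)

    def forward(self, x, noise_std=0.1):
        corrupted = x + noise_std * torch.randn_like(x)  # corrupt the input
        return self.decoder(self.encoder(corrupted))     # reconstruct the clean state

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Standard generalized advantage estimation for one finished trajectory."""
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    adv = np.zeros_like(rewards)
    last = 0.0
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]
        last = delta + gamma * lam * last
        adv[t] = last
    return adv

def augmented_advantages(rewards, values, states, dae, beta=0.1):
    """Hypothetical 'empirical term': advantages computed in states the DAE
    reconstructs poorly (unfamiliar, noisy situations) are down-weighted."""
    base = gae(rewards, values)
    with torch.no_grad():
        s = torch.as_tensor(np.asarray(states), dtype=torch.float32)
        recon_error = ((dae(s, noise_std=0.0) - s) ** 2).mean(dim=1).numpy()
    return base - beta * recon_error  # beta is an illustrative weighting factor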

Wang Zeyuan (王泽远)


School of Computer Science and Technology, Fudan University, Shanghai 200438, China


Funding: Science and Technology Project of State Grid Corporation of China (SGBJDKOODWJS2000117)

Journal: Computer Applications and Software (计算机应用与软件)
Sponsored by: Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology
Indexed in: CSTPCD; Peking University Core Journal List (北大核心)
Impact factor: 0.615
ISSN: 1000-386X
Year, Volume (Issue): 2024, 41(7)