Journal of Hunan University (Natural Sciences), 2024, Vol. 51, Issue 6: 73-85. DOI: 10.16339/j.cnki.hdxbzkb.2024268

D3DQN-CAA: A DRL-Based Adaptive Edge Computing Task Scheduling Method

巨涛, 王志强, 刘帅, 火久元, 李启南

Author Information

  • 1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China


Abstract

To address the problems faced by existing deep-reinforcement-learning-based edge computing task scheduling, such as a fixed degree of action-space exploration, low sample efficiency, large memory demand, and poor stability, and to schedule tasks effectively in edge computing systems with relatively limited computing resources, an adaptive edge computing task scheduling method, D3DQN-CAA, is proposed on the basis of an improved deep reinforcement learning model, D3DQN (Dueling Double DQN). In the task offloading decision, the mapping between tasks and processors is treated as a multidimensional knapsack problem, and the computing node that best matches the current task is selected for task processing according to the state information of the task and the computing nodes. To improve the parameter-update efficiency of the evaluation network and reduce the influence of overestimation, a comprehensive Q-value calculation method is proposed. To further accelerate the convergence of the neural network, an adaptive strategy for dynamically adjusting the degree of action-space exploration is proposed. To reduce the storage resources required by the system and improve sample efficiency, an adaptive lightweight prioritized replay mechanism is proposed. Experimental results show that, compared with several benchmark algorithms, D3DQN-CAA effectively reduces the number of training steps of the deep reinforcement learning network, makes full use of edge computing resources to improve the real-time performance of task processing, and lowers system energy consumption.
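The three algorithmic ingredients named in the abstract build on standard DRL components: a Double DQN target to curb overestimation, a decaying exploration schedule, and proportional prioritized replay. The sketch below is a minimal, illustrative rendition of those generic components only; all names, parameters, and the `loss_trend` signal are hypothetical, and it is not the paper's actual D3DQN-CAA implementation, whose comprehensive Q-value calculation and adaptive mechanisms are described here only at a high level.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Q-"networks" as lookup tables: rows = states, cols = actions.
n_states, n_actions = 5, 3
q_eval = rng.normal(size=(n_states, n_actions))    # evaluation (online) network
q_target = rng.normal(size=(n_states, n_actions))  # target network

def double_dqn_target(reward, next_state, gamma=0.9):
    """Double DQN target: the evaluation network selects the next action,
    the target network evaluates it, which reduces overestimation."""
    a_star = int(np.argmax(q_eval[next_state]))           # action selection
    return reward + gamma * q_target[next_state, a_star]  # action evaluation

def adaptive_epsilon(step, loss_trend, eps_min=0.05, eps_max=1.0, decay=0.01):
    """Hypothetical adaptive exploration schedule: epsilon decays with the
    step count, but decays more slowly while the loss is still falling."""
    base = eps_min + (eps_max - eps_min) * np.exp(-decay * step)
    return float(np.clip(base * (1.0 + loss_trend), eps_min, eps_max))

class PrioritizedReplay:
    """Minimal proportional prioritized replay buffer (list-backed)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data, self.prios = [], []

    def add(self, transition, td_error, eps=1e-3):
        if len(self.data) == self.capacity:  # overwrite oldest when full
            self.data.pop(0)
            self.prios.pop(0)
        self.data.append(transition)
        self.prios.append(abs(td_error) + eps)  # priority ~ |TD error|

    def sample(self, batch_size):
        p = np.asarray(self.prios)
        p = p / p.sum()
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        return [self.data[i] for i in idx]
```

The split between action selection (evaluation network) and action evaluation (target network) in `double_dqn_target` is the core Double DQN idea that D3DQN inherits; the list-backed buffer stands in for the "lightweight" replay storage only in spirit.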


Key words

edge computing / task scheduling / deep Q-learning / deep reinforcement learning


Funding

National Natural Science Foundation of China (61862037)

National Natural Science Foundation of China (62262038)

Science and Technology Program of Gansu Province (23CXGA0028)

Natural Science Foundation of Gansu Province (22JR5RA356)

Publication Information

Year: 2024
Journal: Journal of Hunan University (Natural Sciences)
Publisher: Hunan University
Indexing: CSTPCD; Peking University Core Journals
Impact factor: 0.651
ISSN: 1674-2974
References: 6