Multi-task end-edge offloading based on Lyapunov optimization and deep reinforcement learning

To enable collaborative processing of heterogeneous industrial tasks in scenarios with multiple devices and multiple edge servers, this paper proposes a multi-task end-edge offloading algorithm based on Lyapunov optimization and deep reinforcement learning. First, to jointly optimize the task offloading decision, offloading ratio, and transmit power, a long-term average system overhead minimization problem is formulated under constraints on computing frequency, transmit power, long-term energy consumption, and task deadlines. Because the variables in the long-term objective and constraints are coupled across time slots, the problem is difficult to solve directly. Lyapunov optimization theory is therefore used to decouple it into independent per-slot optimization problems. Each per-slot problem is then modeled as a Markov decision process, and a deep reinforcement learning-based multi-task offloading algorithm is proposed using a double dueling deep neural network architecture. Experimental results show that the proposed algorithm converges stably and effectively reduces the long-term average system overhead while satisfying the long-term energy consumption constraint and task deadline requirements.
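The abstract does not give the paper's equations, so the following is only a minimal sketch of the standard Lyapunov drift-plus-penalty technique that this kind of decoupling typically relies on. It assumes a single virtual energy queue that tracks how far consumption has run ahead of a long-term average budget, and it picks each slot's action by minimizing a weighted sum of overhead and queue-weighted energy. All names (`Action`, `update_virtual_queue`, `drift_plus_penalty_choice`, `simulate`), the two candidate actions, and the numeric values are hypothetical. The paper solves the per-slot problem with deep reinforcement learning, not with the exhaustive enumeration used here.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical per-slot decision with its overhead and energy use."""
    name: str
    cost: float    # system overhead incurred in one slot (arbitrary units)
    energy: float  # energy consumed in one slot (arbitrary units)


def update_virtual_queue(q: float, energy: float, budget: float) -> float:
    """Virtual queue update: Q(t+1) = max(Q(t) + e(t) - E_avg, 0)."""
    return max(q + energy - budget, 0.0)


def drift_plus_penalty_choice(q, actions, budget, v):
    """Pick the action minimizing V * cost + Q * (energy - budget).

    V trades off overhead against how strictly the energy budget is tracked.
    """
    return min(actions, key=lambda a: v * a.cost + q * (a.energy - budget))


def simulate(actions, budget, v, slots):
    """Run the per-slot rule and return (avg cost, avg energy, final queue)."""
    q = 0.0
    total_cost = 0.0
    total_energy = 0.0
    for _ in range(slots):
        a = drift_plus_penalty_choice(q, actions, budget, v)
        total_cost += a.cost
        total_energy += a.energy
        q = update_virtual_queue(q, a.energy, budget)
    return total_cost / slots, total_energy / slots, q


if __name__ == "__main__":
    # Assumed toy setting: local execution is cheap in overhead but
    # energy-hungry; offloading is the reverse.
    candidates = [Action("local", 1.0, 2.0), Action("offload", 3.0, 0.5)]
    print(simulate(candidates, budget=1.0, v=5.0, slots=1000))
```

In this toy setting the rule starts with the low-overhead action, lets the virtual queue grow, and then alternates between the two actions so that the queue stays bounded and the average energy stays close to the budget. A larger `v` favors lower overhead at the price of a longer backlog before the energy constraint takes effect.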

heterogeneous industrial tasks; task offloading; Lyapunov optimization; Markov decision process; deep reinforcement learning

XU Chi (许驰), TANG Zixuan (唐紫萱), JIN Xi (金曦), XIA Changqing (夏长清)

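State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China

Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China

Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China

University of Chinese Academy of Sciences, Beijing 100049, China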

National Natural Science Foundation of China (92267108, 62173322, 62133014, 61972389); Liaoning Provincial Science and Technology Program (2023JH3/10200004, 2023JH3/10200006, 2022JH25/10100005); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019202, 2020207, Y2021062)

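2024

Control and Decision (控制与决策)
Northeastern University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.227
ISSN: 1001-0920
Year, volume (issue): 2024, 39(7)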