Joint Routing and Resource Scheduling Algorithm for Large-scale Multi-mode Mesh Networks Based on Reinforcement Learning

Abstract: To balance transmission reliability and efficiency in large-scale multi-mode mesh networks for new-type power systems, this paper first formulates and analyzes the optimization problem and then proposes a two-stage, reinforcement-learning-based algorithm for joint routing selection and resource scheduling. In the first stage, a multiple-shortest-path routing algorithm takes the network topology and the service requirements as input and outputs all shortest paths. In the second stage, a resource scheduling algorithm based on the Multi-Armed Bandit (MAB) model builds the bandit's arms from the resulting set of shortest paths, computes rewards according to the service requirements, and finally outputs the optimal routing and resource scheduling scheme for service transmission. Simulation results show that the proposed algorithm meets different service transmission requirements and achieves an efficient balance between the average end-to-end path delay and the average transmission success rate.
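The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy topology, the transmission modes `plc` and `rf`, the per-link statistics in `LINK_STATS`, the reward formula, the weight `W`, and the choice of UCB1 as the bandit policy are all assumptions made here for illustration. Stage 1 enumerates every minimum-hop path with a breadth-first search; stage 2 treats each (path, mode) pair as an arm and learns which arm gives the best reward.

```python
import math
import random
from collections import deque

# Hypothetical 4-node mesh: two equal-length routes from A to D.
ADJ = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
# Hypothetical per-link, per-mode statistics: (success probability, delay in ms).
LINK_STATS = {
    ("A", "B"): {"plc": (0.95, 8.0), "rf": (0.70, 3.0)},
    ("B", "D"): {"plc": (0.95, 8.0), "rf": (0.70, 3.0)},
    ("A", "C"): {"plc": (0.70, 9.0), "rf": (0.60, 4.0)},
    ("C", "D"): {"plc": (0.70, 9.0), "rf": (0.60, 4.0)},
}
MAX_DELAY_MS = 20.0  # assumed normalization bound for the end-to-end delay
W = 0.7              # assumed weight of reliability versus delay in the reward


def all_shortest_paths(adj, src, dst):
    """Stage 1: return every minimum-hop path from src to dst."""
    dist, parents, queue = {src: 0}, {src: []}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:                 # first time v is reached
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:      # another equally short predecessor
                parents[v].append(u)
    if dst not in dist:
        return []

    def unwind(node):
        if node == src:
            return [[src]]
        return [p + [node] for par in parents[node] for p in unwind(par)]

    return [tuple(p) for p in unwind(dst)]


def pull(arm, rng):
    """Simulate one transmission on a (path, mode) arm; reward lies in [0, 1]."""
    path, mode = arm
    delay = 0.0
    for link in zip(path, path[1:]):
        p_ok, d = LINK_STATS[link][mode]
        if rng.random() > p_ok:
            return 0.0                        # packet lost on this hop
        delay += d
    return W + (1.0 - W) * (1.0 - delay / MAX_DELAY_MS)


def ucb1(arms, rounds, seed=0):
    """Stage 2: UCB1 over the arms; returns (best arm, mean rewards, pull counts)."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, rounds + 1):
        if t <= len(arms):
            i = t - 1                         # play every arm once first
        else:
            i = max(range(len(arms)),
                    key=lambda k: means[k] + math.sqrt(2 * math.log(t) / counts[k]))
        reward = pull(arms[i], rng)
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]   # incremental average
    best = max(range(len(arms)), key=lambda k: means[k])
    return arms[best], means, counts


paths = all_shortest_paths(ADJ, "A", "D")
arms = [(path, mode) for path in paths for mode in ("plc", "rf")]
best_arm, means, counts = ucb1(arms, rounds=3000)
print("shortest paths:", paths)
print("selected arm:", best_arm)
```

Raising `W` favors reliable arms and lowering it favors low-delay arms, which is one simple way to express different service requirements in the reward. A real deployment would replace the simulated `pull` with measured transmission outcomes.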

Keywords: Mesh networks; Routing selection; Resource scheduling; Multi-Armed Bandit (MAB); Reinforcement learning

Authors: Zhu Xiaorong (朱晓荣), He Chuhong (贺楚闳)

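Affiliation: School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China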

Funding: National Natural Science Foundation of China (92367102, 92067101); Jiangsu Provincial Key Research and Development Program (BE2021013-3)

Publication year: 2024

DOI: 10.11999/JEIT231103

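Journal: Journal of Electronics & Information Technology (电子与信息学报)

Sponsors: Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China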

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 1.302

ISSN: 1009-5896

Year, volume (issue): 2024, 46(7)