Joint Routing and Resource Scheduling Algorithm for Large-scale Multi-mode Mesh Networks Based on Reinforcement Learning

To balance transmission reliability and efficiency in large-scale multi-mode mesh networks in the new power system, this paper describes and analyzes the optimization problem and then proposes a two-stage, reinforcement-learning-based algorithm for joint routing selection and resource scheduling. In the first stage, a multiple-shortest-path routing algorithm takes the network topology information and service requirements as input and outputs all shortest paths. In the second stage, a resource scheduling algorithm based on the Multi-Armed Bandit (MAB) is proposed: it constructs the arms of the MAB from the obtained set of shortest paths, calculates the reward according to the service requirements, and finally gives the optimal routing selection and resource scheduling scheme for service transmission. Simulation results show that the proposed algorithm can meet different service transmission requirements and achieves an efficient balance between the average end-to-end path delay and the average transmission success rate.
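
The two-stage procedure described in the abstract can be illustrated with a small sketch. The abstract does not state the path metric, the bandit policy, or the reward formula, so everything below is an assumption made for illustration: hop count as the path metric, an epsilon-greedy policy, a reward that weights delivery success against normalized delay, and the hypothetical names `all_shortest_paths`, `make_pull`, `epsilon_greedy`, and `link_stats`. It is not the authors' implementation.

```python
import random
from collections import deque


def all_shortest_paths(adj, src, dst):
    """Stage 1 (assumed metric: hop count): list every minimum-hop path."""
    dist = {src: 0}
    parents = {src: []}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:                 # first time v is reached
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:      # another equally short way in
                parents[v].append(u)
    if dst not in dist:
        return []

    def walk(node):
        # Rebuild paths backwards through the recorded parents.
        if node == src:
            return [[src]]
        return [p + [node] for par in parents[node] for p in walk(par)]

    return walk(dst)


def make_pull(link_stats, weight=0.5, delay_cap=20.0):
    """Build a simulated reward function (hypothetical reward definition).

    link_stats maps (u, v) -> (success probability, delay in ms).
    """
    def pull(path, rng):
        hops = list(zip(path, path[1:]))
        delay = sum(link_stats[h][1] for h in hops)
        delivered = all(rng.random() < link_stats[h][0] for h in hops)
        return weight * float(delivered) - (1 - weight) * min(delay / delay_cap, 1.0)
    return pull


def epsilon_greedy(arms, pull, rounds=2000, eps=0.1, seed=0):
    """Stage 2 (assumed policy): epsilon-greedy bandit, one arm per path."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    values = [0.0] * len(arms)
    for t in range(rounds):
        if t < len(arms):
            i = t                              # try every arm once
        elif rng.random() < eps:
            i = rng.randrange(len(arms))       # explore
        else:
            i = max(range(len(arms)), key=values.__getitem__)  # exploit
        reward = pull(arms[i], rng)
        counts[i] += 1
        values[i] += (reward - values[i]) / counts[i]  # running mean
    best = max(range(len(arms)), key=values.__getitem__)
    return arms[best], values


# Toy four-node mesh with two equal-length routes from A to D.
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
link_stats = {
    ("A", "B"): (0.95, 2.0), ("B", "D"): (0.95, 2.0),   # reliable, fast
    ("A", "C"): (0.60, 5.0), ("C", "D"): (0.60, 5.0),   # lossy, slow
}
paths = all_shortest_paths(adj, "A", "D")
best, values = epsilon_greedy(paths, make_pull(link_stats))
```

In this toy topology both candidate routes have two hops, so the first stage alone cannot separate them; the bandit stage learns from simulated rewards that the route through B is preferable. Changing `weight` shifts the trade-off between transmission success and delay.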

Mesh networks; Routing selection; Resource scheduling; Multi-Armed Bandit (MAB); Reinforcement learning

ZHU Xiaorong (朱晓荣), HE Chuhong (贺楚闳)

School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003

National Natural Science Foundation of China (92367102, 92067101); Key Research and Development Program of Jiangsu Province (BE2021013-3)

2024

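Journal of Electronics & Information Technology
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China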

CSTPCD; Peking University Core Journals
Impact factor: 1.302
ISSN: 1009-5896
Year, Volume (Issue): 2024, 46(7)