Journal of China Academy of Electronics and Information Technology, 2024, Vol. 19, Issue (2): 138-148. DOI: 10.3969/j.issn.1673-5692.2024.02.005

Tactical Communication Network Path Selection Algorithm Based on Deep Reinforcement Learning

Pan Chengsheng (潘成胜) 1, Cao Kangning (曹康宁) 1, Shi Huaifeng (石怀峰) 2, Wang Yingzhi (王英植) 1

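Author information

  • 1. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China
  • 2. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China; School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China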

Abstract

The heterogeneity of combat services, the diversity of link types, and the time-varying nature of link states cause the decision space of end-to-end transmission paths that satisfy quality of service (QoS) requirements to grow exponentially, which makes it harder to match each service with its optimal path. To address this problem, a tactical communication network path selection algorithm based on deep reinforcement learning (DRL-ST) is proposed. DRL-ST builds an end-to-end transmission path decision model with Dueling DQN and uses a SumTree storage structure to optimize the sampling mechanism, which improves the convergence speed of the model. Furthermore, after characterizing the end-to-end QoS parameters of a transmission path, a reward function based on multi-service characteristics is constructed to achieve optimal matching between service QoS requirements and transmission paths. Experimental results show that, compared with traditional algorithms, DRL-ST meets the service QoS requirements while reducing end-to-end delay and packet loss rate by 16.78% and 28.43%, respectively, and increasing throughput by up to 20.36%.
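The abstract names two building blocks, Dueling DQN and SumTree-based sampling, without giving implementation details. The following is a minimal illustrative sketch of the generic textbook forms of both, not the authors' code. All names (`SumTree`, `sample_batch`, `dueling_q`) are hypothetical, and the use of priorities to drive proportional sampling is an assumption taken from standard prioritized experience replay rather than from the paper.

```python
import random


class SumTree:
    """Binary tree: leaves hold priorities, internal nodes hold child sums."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Indices 0..capacity-2 are internal nodes, the rest are leaves.
        self.tree = [0.0] * (2 * capacity - 1)
        self.data = [None] * capacity
        self.write = 0  # next leaf slot to overwrite (ring buffer)

    def total(self):
        # The root stores the sum of all priorities.
        return self.tree[0]

    def add(self, priority, item):
        leaf = self.write + self.capacity - 1
        self.data[self.write] = item
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, leaf, priority):
        # Set the leaf and propagate the difference up to the root.
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def get(self, value):
        """Return (leaf index, priority, item) for a value in [0, total)."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):
            left = 2 * idx + 1
            if value < self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]


def sample_batch(tree, batch_size, rng=random):
    # Stratified sampling: one draw from each equal-width priority segment.
    segment = tree.total() / batch_size
    return [tree.get(segment * (i + rng.random())) for i in range(batch_size)]


def dueling_q(value, advantages):
    # Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]
```

In this sketch a transition with a larger priority occupies a wider slice of the interval [0, total), so it is drawn more often, and each lookup or update costs O(log n) instead of a linear scan of the replay buffer. The reward function is not sketched here because the abstract does not specify its form.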

Key words

deep reinforcement learning/software defined network/path optimization/quality of service

Funding

National Natural Science Foundation of China (61931004)

Publication year

2024
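Journal: Journal of China Academy of Electronics and Information Technology
Publisher: China Academy of Electronics and Information Technology

Indexed in: CSTPCD
Impact factor: 0.663
ISSN: 1673-5692
Number of references: 21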