
Tactical Communication Network Path Selection Algorithm Based on Deep Reinforcement Learning

The heterogeneity of combat services, the diversity of link types, and the time-varying nature of link states cause the decision space of end-to-end transmission paths that satisfy quality-of-service (QoS) requirements to grow exponentially, making it harder to match each service with its optimal path. To address this problem, a tactical communication network path selection algorithm based on deep reinforcement learning (DRL-ST) is proposed. DRL-ST constructs an end-to-end transmission-path decision model with a Dueling DQN and uses a SumTree storage structure to optimize the sampling mechanism, accelerating model convergence. Furthermore, on the basis of characterizing the end-to-end QoS parameters of transmission paths, a reward function based on multi-service characteristics is constructed to achieve optimal matching between service QoS requirements and transmission paths. Experimental results show that, compared with traditional algorithms, DRL-ST meets service QoS requirements while reducing end-to-end delay and packet loss rate by 16.78% and 28.43%, respectively, and increasing throughput by up to 20.36%.
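The paper does not detail its Dueling DQN internals here, but the dueling aggregation such a model relies on can be sketched briefly. The function below is illustrative only (the name `dueling_q_values` and the example numbers are assumptions, not from the paper): the Q-value of each candidate path action is decomposed into a state value V(s) and per-action advantages A(s, a), recombined as Q(s, a) = V(s) + A(s, a) − mean A(s, ·), with the mean subtracted to keep the decomposition identifiable.

```python
def dueling_q_values(state_value, advantages):
    """Combine the value stream V(s) and advantage stream A(s, a)
    into Q-values: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]

# Example: three candidate end-to-end paths for one network state.
q = dueling_q_values(state_value=2.0, advantages=[1.0, -1.0, 0.0])
best_path = max(range(len(q)), key=q.__getitem__)  # index of the path to select
```

In a full agent the two streams would be neural-network heads sharing a common encoder; the separation lets the model learn how good a network state is independently of which path is chosen in it.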
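The SumTree mentioned in the abstract is the standard data structure behind prioritized experience replay: leaves hold sampling priorities, each internal node stores the sum of its children, so a uniform draw in [0, total) maps to a leaf in O(log n). A minimal pure-Python sketch (illustrative only, assuming a power-of-two capacity; the class and method names are not from the paper):

```python
class SumTree:
    """Binary sum-tree over `capacity` priorities (capacity must be a
    power of two in this sketch). 1-based heap layout: leaves occupy
    indices [capacity, 2*capacity)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity)

    def update(self, index, priority):
        """Set the priority of leaf `index` and propagate the change upward."""
        i = index + self.capacity
        change = priority - self.tree[i]
        while i >= 1:
            self.tree[i] += change
            i //= 2

    def total(self):
        return self.tree[1]  # root holds the sum of all priorities

    def sample(self, value):
        """Return the leaf index whose cumulative-priority range contains `value`."""
        i = 1
        while i < self.capacity:          # descend until a leaf
            left = 2 * i
            if value < self.tree[left]:
                i = left
            else:
                value -= self.tree[left]  # skip the left subtree's mass
                i = left + 1
        return i - self.capacity

tree = SumTree(4)
for idx, p in enumerate([3.0, 1.0, 5.0, 1.0]):
    tree.update(idx, p)
```

Drawing `value` uniformly from [0, tree.total()) then yields each transition with probability proportional to its priority, which is how sampling can be biased toward experiences with large TD error to speed up convergence.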

deep reinforcement learning; software defined network; path optimization; quality of service

Pan Chengsheng, Cao Kangning, Shi Huaifeng, Wang Yingzhi


School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China

School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China


National Natural Science Foundation of China (61931004)


Journal of China Academy of Electronics and Information Technology
China Academy of Electronics and Information Technology

Impact factor: 0.663
ISSN: 1673-5692
Year, volume (issue): 2024, 19(2)