A Resource Allocation Algorithm for Space-Air-Ground Integrated Network Based on Deep Reinforcement Learning

Liu Xuefang¹, Mao Weihao¹, Yang Qinghai¹

Author information

1. School of Telecommunications Engineering, Xidian University, Xi'an 710071, China

Abstract

The Space-Air-Ground Integrated Network (SAGIN) can effectively meet the communication needs of diverse service types by improving the resource utilization of the ground network; however, the adaptability and robustness of the system and the Quality of Service (QoS) of different users have been neglected. To address this problem, this paper proposes a Deep Reinforcement Learning (DRL) resource allocation algorithm for urban and suburban communications under the SAGIN architecture. Based on the Reference Signal Received Power (RSRP) defined in the 3rd Generation Partnership Project (3GPP) standard, and taking terrestrial co-channel interference into account, an optimization problem that maximizes the downlink throughput of system users is formulated, with the time-frequency resources of base stations in different domains as constraints. When the Deep Q-Network (DQN) algorithm is used to solve this problem, a reward function is defined that jointly accounts for users' QoS requirements, system adaptability, and system robustness. Simulation results show that, when the service requirements of autonomous vehicles, immersive services, and ordinary mobile terminals are considered together, the reward function value, which characterizes system performance, is 39.1% higher than that of the greedy algorithm at 2,000 iterations. For the autonomous vehicle service, resource allocation with the DQN algorithm reduces the number of lost packets by 38.07% on average and the delay by 6.05% compared with the greedy algorithm.
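
The abstract does not give the form of the reward function or the DQN update, so the following is only a minimal sketch of how such pieces are commonly put together. The weighted-sum reward, the weights, the discount factor, and all function names are assumptions made for illustration and are not taken from the paper; only the temporal-difference target and the epsilon-greedy rule are the standard DQN ingredients.

```python
import random


def composite_reward(qos, adaptability, robustness, weights=(0.5, 0.25, 0.25)):
    """Hypothetical reward: a weighted sum of three normalized scores.

    Assumption: the paper's actual reward is not given in the abstract;
    a weighted sum is just one simple way to combine the three criteria.
    """
    w_q, w_a, w_r = weights
    return w_q * qos + w_a * adaptability + w_r * robustness


def td_target(reward, next_q_values, gamma=0.9, done=False):
    """Standard DQN target: r + gamma * max over next-state Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best one.

    Here an "action" would stand for one candidate resource assignment.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    r = composite_reward(qos=0.8, adaptability=0.6, robustness=0.9)
    print("reward:", r)
    print("target:", td_target(r, next_q_values=[0.2, 1.1, 0.7]))
    print("action:", epsilon_greedy([0.2, 1.1, 0.7], epsilon=0.1))
```

With `epsilon=0.0` the selection rule always takes the highest-valued action, which is the purely greedy choice; a DQN agent instead learns the Q-values through the target above and explores with a nonzero `epsilon` during training.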

Key words

Space-Air-Ground Integrated Network (SAGIN) / Resource allocation algorithm / Deep Reinforcement Learning (DRL) / Deep Q-Network (DQN)

Funding

National Key Research and Development Program of China (2020YFB1807700)

Publication year

2024

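Journal of Electronics & Information Technology

Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China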

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 1.302

ISSN: 1009-5896

Number of references: 9
