
Resource Allocation of Low Earth Orbit Satellites Based on Deep Reinforcement Learning

To address the problem that energy efficiency (EE) and spectral efficiency (SE) in low earth orbit (LEO) satellites cannot be improved along a consistent growth trend, a method for optimizing the trade-off between EE and SE in LEO satellites is proposed. The method models the LEO satellite resource-allocation scenario, simplifies the dynamic model by dividing it into time slots, and optimizes throughput by adjusting sub-carrier power, thereby optimizing EE and SE. EE and SE are then weighted by introducing a weight factor and unifying their units, so as to achieve the best balance between the two. To handle the large state-action space, the dueling deep Q-network (Dueling DQN) algorithm is used to obtain a better control policy. Simulation results show that the proposed algorithm converges faster than other deep reinforcement learning (DRL) algorithms, and its converged value is 10.1% and 18.2% higher, respectively. When the noise power changes, the SE obtained by Dueling DQN is 15.6% higher than that of other DRL algorithms.
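As a minimal illustration of the two ideas named in the abstract, the sketch below shows a dueling Q-network head (separate state-value and advantage streams) and a weighted reward that combines unit-normalised EE and SE through a weight factor. It assumes PyTorch, a discrete set of sub-carrier power levels as the action space, and hypothetical normalisation constants; none of the layer sizes, constants, or names are taken from the paper.

```python
import torch
import torch.nn as nn

class DuelingDQN(nn.Module):
    """Minimal dueling Q-network: shared trunk, separate value and advantage streams."""
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                 # state value V(s)
        self.advantage = nn.Linear(hidden, num_actions)   # advantages A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.trunk(state)
        v = self.value(h)
        a = self.advantage(h)
        # Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        return v + a - a.mean(dim=-1, keepdim=True)

def weighted_ee_se_reward(throughput_bps: float, bandwidth_hz: float,
                          total_power_w: float, omega: float = 0.5) -> float:
    """Illustrative reward: bring EE (bit/J) and SE (bit/s/Hz) to a common
    dimensionless scale and combine them with weight factor omega."""
    se = throughput_bps / bandwidth_hz    # spectral efficiency
    ee = throughput_bps / total_power_w   # energy efficiency
    # Hypothetical normalisation constants; the paper's dimensional
    # unification scheme is not reproduced here.
    se_norm = se / 10.0
    ee_norm = ee / 1e6
    return omega * ee_norm + (1.0 - omega) * se_norm
```

A typical use would evaluate the network over all discrete power-level actions for the current channel state and pick the action with the largest Q-value, feeding the weighted EE/SE value back as the per-slot reward.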

Keywords: low earth orbit (LEO) satellite; resource allocation; deep reinforcement learning (DRL); energy efficiency (EE); spectral efficiency (SE)

WANG Wenhao, CHEN Fatang, XU Xiaopeng, ZHOU Yuqian


School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China


Journal of Army Engineering University of PLA (陆军工程大学学报)

Publisher: Scientific Research Department, PLA University of Science and Technology
Impact factor: 0.556
ISSN: 2097-0730
Year, volume (issue): 2024, 3(6)