A Recursive DRL-Based Resource Allocation Method for Multibeam Satellite Communication Systems
Optimization-based radio resource management (RRM) has shown significant performance gains on high-throughput satellites (HTSs). However, as the number of allocable on-board resources increases, traditional RRM becomes difficult to apply in real satellite systems because of its high computational complexity. Deep reinforcement learning (DRL) is a promising solution to the resource allocation problem because it is model-free. Nevertheless, the action space faced by DRL grows exponentially with the communication scale, which leads to an excessive exploration cost. In this paper, we propose a recursive frequency resource allocation algorithm based on long short-term memory (LSTM) and proximal policy optimization (PPO), called PPO-RA-LOOP, where RA stands for resource allocation and LOOP indicates that the algorithm outputs actions recursively. Specifically, the PPO algorithm uses an LSTM network to recursively generate a frequency resource allocation sub-action for each beam, which significantly reduces the action space. In addition, the LSTM-based recursive architecture allows PPO to allocate the next frequency resource better by using information from the previously generated sub-actions as prior knowledge, which reduces the complexity of the neural network. Simulation results show that PPO-RA-LOOP achieves higher spectral efficiency and system satisfaction than other frequency allocation algorithms.
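The action-space argument above can be illustrated with a small sketch. It is not the authors' PPO-RA-LOOP implementation: the function names, the number of frequency options per beam, and the usage-count "state" (a hypothetical stand-in for the LSTM hidden state) are all assumptions made for illustration. It only shows why emitting one sub-action per beam replaces a single choice among `num_options ** num_beams` joint actions with `num_beams` successive choices among `num_options`.

```python
import math
import random


def joint_action_space(num_beams, num_options):
    """Number of joint actions if all beams are decided in a single step."""
    return num_options ** num_beams


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_allocation(num_beams, num_options, rng):
    """Toy recursive sampler: one sub-action per beam, conditioned on history.

    `usage` is a hypothetical stand-in for the recurrent (LSTM) state: it
    summarizes the sub-actions generated so far. The penalty rule below is
    illustrative only, not a trained policy.
    """
    usage = [0] * num_options
    allocation = []
    for _ in range(num_beams):
        # Options already picked by earlier beams become less likely.
        logits = [-float(count) for count in usage]
        probs = softmax(logits)
        choice = rng.choices(range(num_options), weights=probs, k=1)[0]
        allocation.append(choice)
        usage[choice] += 1
    return allocation


if __name__ == "__main__":
    beams, options = 10, 4  # assumed sizes, not taken from the paper
    print("joint actions:", joint_action_space(beams, options))
    print("recursive: %d steps of %d choices" % (beams, options))
    print("sample:", sample_allocation(beams, options, random.Random(0)))
```

With the assumed sizes of 10 beams and 4 options, the joint formulation has 4^10 = 1,048,576 actions, whereas the recursive formulation needs only a 4-way output evaluated 10 times.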
Keywords: High-throughput satellites; Proximal policy optimization; Deep reinforcement learning; Long short-term memory
Haowei MENG, Ning XIN, Hao QIN, Di ZHAO
State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China
China Academy of Space Technology, Institute of Telecommunication Satellite, Beijing 100094, China
Funding: National Natural Science Foundation of China; Key Research and Development Program of Shaanxi; ISN State Key Laboratory