Research on Resource Allocation in UAV Networks Based on Reinforcement Learning

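Abstract (translated from Chinese): This paper takes resource allocation in UAV networks as its subject and studies a reinforcement-learning-based dynamic time slot allocation scheme for multi-UAV networks. In a UAV network, allocating time slot resources sensibly is important for improving resource utilization. For the dynamic time slot allocation problem, a time slot allocation model for multi-UAV networks is built from the constraints of the scheduling problem, and a time slot allocation scheme based on the proximal policy optimization (PPO) reinforcement learning algorithm is proposed. The problem is mapped onto a reinforcement learning environment, and a Markov decision process (MDP) model is built to match the interface of the reinforcement learning algorithm. The model is trained in the gym simulation environment to validate the proposed scheme. The simulation results show that the PPO-based scheme allocates time slots efficiently in a multi-UAV network environment and improves network channel utilization. The training time can be shortened as practical needs require while still producing good allocation results.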
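For readers unfamiliar with PPO, the clipped surrogate objective it is usually trained with is reproduced below in its standard textbook form. The abstract gives no formulas or hyperparameters, so the symbols here (the policy parameters, the advantage estimate, and the clip range) are notation assumed for illustration, not the authors' own.

```latex
% Probability ratio between the new and old policies (assumed notation)
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}

% Standard PPO clipped surrogate objective, maximized with respect to \theta
L^{\mathrm{CLIP}}(\theta) =
  \hat{\mathbb{E}}_t\!\left[
    \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
                \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big)
  \right]
```

Here $s_t$ and $a_t$ are the MDP state and action at step $t$, $\hat{A}_t$ is an advantage estimate, and $\epsilon$ is the clip range. In a slot allocation setting the action would plausibly be the choice of which UAV receives a slot, but that mapping is an assumption.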

English title: Research on Resource Allocation in UAV Networks Based on Reinforcement Learning

English abstract: This paper studies resource allocation in UAV networks and investigates a reinforcement-learning-based dynamic time slot allocation scheme for multi-UAV networks. In UAV networks, allocating time slot resources reasonably is important for improving resource utilization. To address the dynamic time slot allocation problem, a time slot allocation model for multi-UAV networks is established according to the constraints of the scheduling problem. A time slot allocation scheme based on the proximal policy optimization (PPO) reinforcement learning algorithm is proposed; the problem is mapped onto a reinforcement learning environment, and a Markov decision process (MDP) model is built to match the interface of the reinforcement learning algorithm. The model is trained in the Gym simulation environment to validate the proposed scheme. The simulation results show that the PPO-based scheme allocates time slots efficiently in a multi-UAV network environment and improves network channel utilization. The training time of the proposed scheme can be shortened according to actual demands while still yielding good allocation results.
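The environment mapping described in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: the class name `SlotAllocationEnv`, the queue-based state, the one-slot-per-step action, and the reward of 1 for a used slot are all assumptions chosen for illustration. It mimics the classic Gym `reset`/`step` interface in plain Python, without importing the library, so that it runs without extra dependencies.

```python
import random
from typing import Callable, Dict, List, Tuple

Obs = Tuple[int, ...]


class SlotAllocationEnv:
    """Hypothetical toy MDP: each step assigns one time slot to one UAV."""

    def __init__(self, num_uavs: int = 4, num_slots: int = 10,
                 max_queue: int = 5, seed: int = 0) -> None:
        self.num_uavs = num_uavs      # number of UAVs competing for slots
        self.num_slots = num_slots    # slots per frame (episode length)
        self.max_queue = max_queue    # largest initial backlog per UAV
        self.rng = random.Random(seed)
        self.queues: List[int] = [0] * num_uavs
        self.slot = 0

    def _obs(self) -> Obs:
        # State: pending packets per UAV, followed by the current slot index.
        return tuple(self.queues) + (self.slot,)

    def reset(self) -> Obs:
        # Draw a fresh random traffic demand for every UAV.
        self.queues = [self.rng.randint(0, self.max_queue)
                       for _ in range(self.num_uavs)]
        self.slot = 0
        return self._obs()

    def step(self, action: int) -> Tuple[Obs, float, bool, Dict]:
        if not 0 <= action < self.num_uavs:
            raise ValueError(f"action must be in [0, {self.num_uavs})")
        # Reward 1 if the chosen UAV had data to send, 0 if the slot is wasted.
        if self.queues[action] > 0:
            self.queues[action] -= 1
            reward = 1.0
        else:
            reward = 0.0
        self.slot += 1
        done = self.slot >= self.num_slots
        return self._obs(), reward, done, {}


def longest_queue_policy(obs: Obs) -> int:
    """Baseline heuristic: give the slot to the UAV with the largest backlog."""
    queues = obs[:-1]
    return max(range(len(queues)), key=lambda i: queues[i])


def run_episode(env: SlotAllocationEnv, policy: Callable[[Obs], int]) -> float:
    """Run one frame and return the fraction of slots that carried data."""
    obs, done, used = env.reset(), False, 0.0
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        used += reward
    return used / env.num_slots


if __name__ == "__main__":
    utilization = run_episode(SlotAllocationEnv(seed=42), longest_queue_policy)
    print(f"channel utilization: {utilization:.2f}")
```

A trained PPO policy would take the place of `longest_queue_policy` here; the heuristic only serves to exercise the interface. The value returned by `run_episode` is a simple stand-in for the channel utilization metric mentioned in the abstract.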

English keywords:

deep reinforcement learning; multi-UAV networks; dynamic time slot allocation; resource allocation; proximal policy optimization

Authors:

Fan Wendi (范文帝), Wang Junfang (王俊芳), Dang Tian (党甜), Du Longhai (杜龙海), Chen Cong (陈丛)

Affiliation:

The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang 050081, China

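Keywords:

deep reinforcement learning; multi-UAV networks; dynamic time slot allocation; resource allocation; proximal policy optimization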

Funding:

National Defense Basic Scientific Research Program of China

Project number:

JCKY2020210B021

Publication year:

2024

DOI:

10.16526/j.cnki.11-4762/tp.2024.01.042

Computer Measurement & Control (计算机测量与控制)

China Computer Automated Measurement and Control Technology Association

CSTPCD

Impact factor: 0.546

ISSN: 1671-4598

Year, volume (issue): 2024, 32(1)

Number of references: 8