Research on Resource Allocation in UAV Networks Based on Reinforcement Learning
This paper takes resource allocation in UAV networks as its research object and investigates a dynamic time slot allocation scheme for multi-UAV networks based on reinforcement learning. In UAV networks, allocating time slot resources reasonably is important for improving resource utilization. To address the dynamic time slot allocation problem, a time slot allocation model for multi-UAV networks is established according to the constraints of the scheduling problem. A time slot allocation scheme based on the proximal policy optimization (PPO) reinforcement learning algorithm is then proposed: the allocation problem is mapped onto a reinforcement learning environment, and a Markov decision process (MDP) model is built to match the interface of the reinforcement learning algorithm. The model is trained in the Gym simulation environment to validate the proposed scheme. Simulation results show that the PPO-based scheme performs time slot allocation efficiently and improves network channel utilization in a multi-UAV network environment. The training time of the proposed scheme can also be reduced appropriately according to actual demands while still obtaining optimal allocation results.
deep reinforcement learning; multi-UAV networks; dynamic time slot allocation; resource allocation; proximal policy optimization
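
To make the environment mapping mentioned in the abstract more concrete, the following is a minimal sketch of how a time slot allocation problem might be expressed as an MDP behind a Gym-style `reset`/`step` interface. It is not the authors' model: the class name `SlotAllocationEnv`, the state (remaining demand per UAV plus the current slot index), the action (which UAV receives the current slot), the reward (1 for a slot that carries traffic, 0 for a wasted slot), and all parameter values are assumptions chosen for illustration. The sketch uses only the standard library and follows the classic Gym convention of returning `(observation, reward, done, info)` from `step`.

```python
import random


class SlotAllocationEnv:
    """Toy Gym-style environment (hypothetical): grant each slot of a frame to one UAV."""

    def __init__(self, num_uavs=3, num_slots=6, max_demand=3, seed=0):
        self.num_uavs = num_uavs
        self.num_slots = num_slots
        self.max_demand = max_demand
        self.rng = random.Random(seed)

    def reset(self):
        # Each UAV starts the frame with a random number of queued packets.
        self.demand = [self.rng.randint(0, self.max_demand)
                       for _ in range(self.num_uavs)]
        self.slot = 0
        self.used = 0
        return self._obs()

    def _obs(self):
        # State: remaining demand per UAV, followed by the current slot index.
        return tuple(self.demand) + (self.slot,)

    def step(self, action):
        # Action: index of the UAV that is granted the current slot.
        if self.demand[action] > 0:
            self.demand[action] -= 1
            self.used += 1
            reward = 1.0  # the slot carried a packet
        else:
            reward = 0.0  # the slot was wasted
        self.slot += 1
        done = self.slot >= self.num_slots
        info = {"utilization": self.used / self.num_slots}
        return self._obs(), reward, done, info


def greedy_policy(obs):
    # Baseline stand-in for a learned policy: serve the UAV with most demand.
    demand = obs[:-1]
    return max(range(len(demand)), key=lambda i: demand[i])


def run_episode(env, policy):
    # Roll out one frame and return (total reward, channel utilization).
    obs = env.reset()
    done = False
    total = 0.0
    info = {}
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
    return total, info["utilization"]


if __name__ == "__main__":
    total, utilization = run_episode(SlotAllocationEnv(), greedy_policy)
    print("total reward:", total, "utilization:", utilization)
```

In a setup like the one the abstract describes, a PPO agent would take the place of `greedy_policy`, and channel utilization could be tracked through the `info` dictionary during training. Newer Gymnasium releases split `done` into `terminated` and `truncated`, so the `step` return value would need adapting there.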