UAV Path Planning and Radio Mapping Based on Deep Reinforcement Learning
To address the limitations of traditional UAV trajectory optimization methods that rely on explicit communication channel models, this paper presents a deep reinforcement learning-based approach to UAV path planning and radio mapping in cellular-connected UAV communication systems. The proposed method uses an extended double deep Q-network (DDQN) combined with a radio map prediction network to generate UAV trajectories and to predict the cumulative reward of candidate actions. Furthermore, the DDQN is trained on a mixture of actual and simulated flights under the Dyna framework, which substantially improves learning efficiency. Simulation results show that, compared with the Direct-RL algorithm, the proposed method exploits the learned coverage probability map more effectively, enabling the UAV to avoid weakly covered areas and reducing the weighted sum of flight time and expected interruption time.
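The two ideas at the core of the abstract, the double-DQN target and Dyna-style planning with simulated experience, can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the tabular Q-arrays stand in for the online and target networks, and the random transition generator stands in for the learned radio/environment model; all sizes, names, and hyperparameters here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 16, 4   # hypothetical discretized map cells and moves
GAMMA = 0.99

# Tabular stand-ins for the online and target Q-networks.
q_online = rng.normal(size=(N_STATES, N_ACTIONS))
q_target = rng.normal(size=(N_STATES, N_ACTIONS))

def ddqn_target(reward, next_state, done):
    """Double-DQN target: the online net SELECTS the next action,
    the target net EVALUATES it, reducing overestimation bias."""
    if done:
        return reward
    a_star = int(np.argmax(q_online[next_state]))        # selection
    return reward + GAMMA * q_target[next_state, a_star]  # evaluation

def dyna_step(state, action, reward, next_state, done, n_sim=5, lr=0.1):
    """One Dyna iteration: learn from a real transition, then replay
    several simulated transitions drawn from a (stand-in) model."""
    # Update on the real flight transition.
    td = ddqn_target(reward, next_state, done) - q_online[state, action]
    q_online[state, action] += lr * td
    # Planning phase: simulated flights from the learned model
    # (here replaced by a random stand-in model for illustration).
    for _ in range(n_sim):
        s = int(rng.integers(N_STATES))
        a = int(rng.integers(N_ACTIONS))
        s2 = int(rng.integers(N_STATES))  # model-predicted next state
        r = float(rng.normal())           # model-predicted reward
        td_sim = ddqn_target(r, s2, False) - q_online[s, a]
        q_online[s, a] += lr * td_sim
```

Mixing each real transition with `n_sim` simulated updates is what lets the Dyna loop extract more learning signal per actual flight, which is the efficiency gain the abstract claims.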