Abstract
Unmanned aerial vehicles (UAVs) are recognized as an effective means of delivering emergency communication services when terrestrial infrastructure is unavailable. This paper investigates a multi-UAV-assisted communication system in which we jointly optimize the UAVs' trajectories, user association, and the transmit power of ground users (GUs) to maximize a defined fairness-weighted throughput metric. Owing to the dynamic nature of UAVs, this problem has to be solved in real time. However, the problem's non-convex and combinatorial nature poses challenges for conventional optimization-based algorithms, particularly in scenarios without a central controller. To address this issue, we propose a multi-agent deep reinforcement learning (MADRL) approach that provides distributed, online solutions. In contrast to previous MADRL-based methods that consider only UAV agents, we model UAVs and GUs as heterogeneous agents sharing a common objective. Specifically, UAVs are tasked with optimizing their trajectories, while GUs are responsible for selecting a UAV for association and determining a transmit power level. To learn policies for these heterogeneous agents, we design a heterogeneous coordinated QMIX (HC-QMIX) algorithm that trains local Q-networks in a centralized manner. With these well-trained local Q-networks, UAVs and GUs can make individual decisions based on their local observations. Extensive simulation results demonstrate that the proposed algorithm outperforms state-of-the-art benchmarks in terms of total throughput and system fairness.
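As a rough illustration of the centralized-training idea behind QMIX-style methods, the sketch below mixes per-agent Q-values into a joint value using state-conditioned, non-negative weights, so that raising any single agent's Q-value can never lower the joint value. It follows the generic QMIX mixing structure, not the HC-QMIX design of this paper, whose details are not given in the abstract. All names (`make_params`, `qmix_mix`), the single-layer linear hypernetworks, the dimensions, and the random parameters are assumptions made for illustration.

```python
import numpy as np

def elu(x):
    # ELU activation; it is monotonically increasing, which preserves monotonic mixing.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def make_params(state_dim, n_agents, embed, seed=0):
    # Hypothetical random hypernetwork parameters (single linear layers for brevity).
    rng = np.random.default_rng(seed)
    return {
        "w1_w": rng.normal(size=(state_dim, n_agents * embed)),
        "b1_w": rng.normal(size=(state_dim, embed)),
        "w2_w": rng.normal(size=(state_dim, embed)),
        "b2_w": rng.normal(size=(state_dim, 1)),
    }

def qmix_mix(agent_qs, state, params):
    """Combine local Q-values (one per agent, UAV or GU) into a joint value."""
    n_agents = agent_qs.shape[0]
    embed = params["b1_w"].shape[1]
    # abs() keeps the mixing weights non-negative, which enforces monotonicity.
    w1 = np.abs(state @ params["w1_w"]).reshape(n_agents, embed)
    b1 = state @ params["b1_w"]
    w2 = np.abs(state @ params["w2_w"])
    b2 = state @ params["b2_w"]
    hidden = elu(agent_qs @ w1 + b1)
    return float(hidden @ w2 + b2[0])

# Assumed toy setup: 2 UAV agents + 3 GU agents, 6-dimensional global state.
params = make_params(state_dim=6, n_agents=5, embed=8, seed=0)
rng = np.random.default_rng(1)
state = rng.normal(size=6)
local_qs = rng.normal(size=5)
q_tot = qmix_mix(local_qs, state, params)
```

Because the joint value is monotone in each local Q-value, each agent can act greedily on its own Q-network at execution time while remaining consistent with the jointly trained value, which is what permits decentralized decisions from local observations.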
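Funding
National Natural Science Foundation of China (62371462)
National Natural Science Foundation of China (61931020)
National Natural Science Foundation of China (62101569)
National Natural Science Foundation of China (U19B2024)
Natural Science Foundation of Hunan Province (2022JJ10068)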
Science and Technology Innovation Program of Hunan Province (2022RC1093)