Collaborative Regional Information Collection Strategy Based on the MLAT-DRL Algorithm
UAV swarms performing collaborative regional information collection in adversarial environments face difficulties such as complex environment structure and blocked swarm communication. To address them, a deep reinforcement learning algorithm based on a multi-level hybrid observation space and an attention mechanism (Multi-Level hybrid observation space with Attention-Deep Reinforcement Learning, MLAT-DRL) is proposed for UAV decision-making in information collection tasks. The algorithm adopts the centralized training with decentralized execution (CTDE) paradigm, which enables efficient collaboration of the UAV swarm without communication. A multi-level hybrid observation space method is proposed to form a multi-scale representation of environmental features and to make efficient use of both global information and local observations. A recurrent neural network (RNN) combined with an attention mechanism is introduced into the network structure, which improves the risk perception ability of the UAV swarm. A prioritized experience replay (PER) strategy is employed to improve sample utilization and reduce training difficulty. Simulation experiments show that the proposed MLAT-DRL algorithm outperforms the baseline algorithms in both data collection and risk avoidance.
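To make the replay strategy concrete, the following is a minimal sketch of proportional prioritized experience replay in its commonly used form. It is not taken from the paper: the class name `PrioritizedReplayBuffer`, the method names, and the values `alpha=0.6`, `beta=0.4`, and `eps=1e-6` are illustrative assumptions, and the priority variant actually used by MLAT-DRL is not stated in the abstract.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed value; 0 would give uniform sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so they are likely to be sampled before their error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling;
        # here they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priority is set to the absolute TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop under these assumptions, each agent's transition would be passed to `add`, a batch drawn with `sample`, the loss scaled by the returned weights, and the new TD errors written back with `update_priorities`. A linear-scan sampler is used for brevity; larger buffers typically use a sum tree instead.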