Collaborative Regional Information Collection Strategy Based on the MLAT-DRL Algorithm (基于MLAT-DRL算法的协同区域信息采集策略)

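Chinese abstract: To address the challenges that unmanned aerial vehicle (UAV) swarms face in collaborative information collection in adversarial environments, such as complex environment structure and obstructed swarm communication, a deep reinforcement learning algorithm based on a multi-level hybrid observation space and an attention mechanism (Multi-Level hybrid observation space with Attention-Deep Reinforcement Learning, MLAT-DRL) is proposed for UAV decision making in information collection tasks. The centralized training with decentralized execution (CTDE) paradigm is adopted to achieve efficient collaboration of the UAV swarm without communication. A multi-level hybrid observation space method is proposed that forms a multi-scale representation of environmental features and makes efficient use of both global information and local observations. A recurrent neural network (RNN) combined with an attention mechanism is introduced into the network structure of the algorithm, improving the risk perception ability of the UAV swarm. A prioritized experience replay (PER) strategy is adopted to improve sample utilization and reduce training difficulty. Simulation experiments verify that the MLAT-DRL algorithm outperforms the baseline algorithms in both data collection and risk avoidance.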

English title: Collaborative Regional Information Collection Strategy Based on MLAT-DRL Algorithm

English abstract: To address the difficulties faced by UAV swarms in collaborative regional information collection in adversarial environments (e.g., complex environment structure and blocked swarm communication), a multi-level hybrid observation space with attention-deep reinforcement learning (MLAT-DRL) algorithm is proposed for UAV decision making in information collection tasks. The proposed algorithm adopts a centralized training with decentralized execution paradigm, which enables efficient collaboration of the UAV swarm in the absence of communication. In addition, a multi-level hybrid observation space method is proposed to develop multi-scale representations of environmental features and make efficient use of global information and local observations. Moreover, the algorithm introduces a recurrent neural network incorporating an attention mechanism into its network, which improves the risk perception ability of the UAV swarm. A prioritized experience replay strategy is employed to improve sample utilization and reduce the difficulty of training. Simulations verify that the proposed MLAT-DRL algorithm outperforms baseline algorithms in terms of data collection and risk aversion.

English keywords:

unmanned aerial vehicle swarm; regional information collection; multi-agent reinforcement learning; multi-level hybrid observation space; attention mechanism

Authors:

娄抒瀚、王冲冲、龚炜、邓立原、李莉

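Affiliations:

College of Electronic and Information Engineering, Tongji University, Shanghai 201804

Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai 201804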

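Keywords:

UAV swarm; regional information collection; multi-agent reinforcement learning; multi-level hybrid observation space; attention mechanism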

Publication year:

2024

DOI:

10.12382/bgxb.2023.1081

Acta Armamentarii (兵工学报)

China Ordnance Society

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.735

ISSN: 1000-1093

Year, volume (issue): 2024, 45(12)