Intelligent obstacle avoidance control method for unmanned aerial vehicle formations in unknown environments

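Abstract (Chinese original, translated): To ensure that fixed-wing unmanned aerial vehicle (UAV) formations fly safely in environments with unknown obstacles, this paper studies flight control methods for fixed-wing UAV formations. Building on deep deterministic policy gradient (DDPG), greedy selection is introduced to construct the Greedy-DDPG algorithm, which is used to train a leader model that performs obstacle avoidance control. An obstacle and collision avoidance control strategy for the follower group is then designed by combining the artificial potential field method with leader-follower consensus, ensuring that the followers avoid obstacles and follow the leader in carrying out the flight mission. Numerical simulation results show that the training time of the Greedy-DDPG algorithm is 5.9% shorter than that of the DDPG algorithm and that its obstacle avoidance generalization is improved. Monte Carlo simulation results show that the method is robust. The method enables cooperative flight of UAV formations and is of great significance for ensuring the flight safety of UAV formations in unknown environments.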
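The follower strategy described above combines two ingredients: tracking a slot defined relative to the leader, and artificial potential field repulsion from obstacles. The following is a minimal sketch of that combination, not the paper's control law. It assumes 2-D point-mass kinematics (ignoring fixed-wing turning constraints), and the function names, gains, and influence radius are all hypothetical.

```python
import numpy as np

def repulsive_velocity(pos, obstacles, influence_radius=30.0, gain=200.0):
    """Classic potential-field repulsion; zero outside the influence radius.

    gain and influence_radius are illustrative values, not from the paper.
    """
    v = np.zeros(2)
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 1e-6 < d < influence_radius:
            # Gradient of 0.5 * gain * (1/d - 1/R)^2, pointing away from the obstacle
            v += gain * (1.0 / d - 1.0 / influence_radius) / d**2 * (diff / d)
    return v

def follower_command(pos, leader_pos, offset, obstacles, k_track=0.5):
    """Velocity command: leader-relative slot tracking plus obstacle repulsion."""
    tracking = k_track * (leader_pos + offset - pos)
    return tracking + repulsive_velocity(pos, obstacles)

# Follower already sits in its slot; one obstacle lies 10 m below it.
cmd = follower_command(
    pos=np.array([80.0, 10.0]),
    leader_pos=np.array([100.0, 0.0]),
    offset=np.array([-20.0, 10.0]),
    obstacles=[np.array([80.0, 0.0])],
)
```

In this toy case the tracking term is zero, so the command consists only of the repulsion pushing the follower away from the obstacle. Passing the positions of the other followers in the `obstacles` list would give a simple form of inter-vehicle collision avoidance under the same assumptions.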

English title: Intelligent obstacle avoidance control method for unmanned aerial vehicle formations in unknown environments

English abstract: [Objective] Fixed-wing unmanned aerial vehicles (UAVs), which are commonly flown in formation for military, rescue, and other missions, generally cannot hover and have a large turning radius. When a formation operates in an unknown environment, it can therefore easily collide with obstacles, which gravely affects flight safety if not guarded against. Unknown environmental obstacles are difficult to avoid using traditional modeling methods, and artificial potential field methods are prone to deadlock problems such as target unreachability and cluster congestion.

[Methods] To achieve collision-free cooperation of UAV formations, this study proposes a deep deterministic policy gradient (DDPG)-based centralized UAV formation control method that combines a centralized communication architecture, reinforcement learning, and the artificial potential field method. First, a Greedy-DDPG flight control method is developed for the leader UAV to improve collision avoidance effectiveness. The reward function, action space, and state space are redesigned to account for maneuver constraints. Additionally, to shorten the training duration, the exploration strategy of DDPG is improved using a greedy scheme: the critic network evaluates the value of a group of random actions, and greedy selection biases the chosen action toward the higher-valued ones, which speeds up the update of the critic network and, in turn, of the overall network. On this basis, a collision-free control method that combines the artificial potential field method with leader-follower consensus is designed for the followers, ensuring that they follow the leader cooperatively without collisions.

[Results] Numerical simulation results show that the training time of the improved DDPG algorithm is 5.9% shorter than that of the original algorithm. In the same scenario, the proposed method perceives the same number of obstacles as the artificial potential field method, but the artificial potential field method shows significant fluctuations in heading angle, while the proposed method shows relatively small ones. The DDPG algorithm produces a smoother heading angle because it perceives fewer obstacles; however, its minimum distance from the obstacles is only 9.1 m, whereas the proposed method stays more than 17 m from the obstacles. Furthermore, Monte Carlo experiments on the leader UAV under different scenarios show that the proposed method improves obstacle avoidance generalization. Experiments were also conducted on the proposed formation control method. Under the same scenario and control parameters, the formation control method based on the proposed architecture has lower formation errors during flight, with a maximum error of no more than 10 m, whereas the artificial potential field-based formation control method has a maximum formation error of over 25 m. When encountering narrow gaps, the proposed method passes through quickly without congestion, while the artificial potential field-based method hovers in front of obstacles, which is not conducive to flight safety. Throughout the flight, the proposed method keeps a greater distance from obstacles and is therefore safer.

[Conclusions] Compared with the original DDPG algorithm, the improved DDPG algorithm trains faster and achieves a better training effect. The proposed formation control method enables UAV formation flight among unknown obstacles. Compared with the formation control method based on the artificial potential field, it avoids hovering in place before obstacles, which is of great significance to the formation flight safety of UAVs.
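The greedy exploration step in the abstract (score a group of random actions with the critic, then favor the best one) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the candidate group is assumed to be the actor's action plus Gaussian-perturbed copies, `critic` is any callable returning a scalar value, and `greedy_explore`, `toy_critic`, and all parameter values are hypothetical.

```python
import numpy as np

def greedy_explore(actor_action, critic, state, n_candidates=8,
                   noise_std=0.2, low=-1.0, high=1.0, rng=None):
    """Return the candidate action with the highest critic value."""
    rng = np.random.default_rng() if rng is None else rng
    actor_action = np.asarray(actor_action, dtype=float)
    # Assumed form of the "random action group": noisy copies of the actor output
    noise = rng.normal(0.0, noise_std, size=(n_candidates, actor_action.size))
    candidates = np.clip(actor_action + noise, low, high)
    # Keep the unperturbed actor action so the choice is never worse than it
    candidates = np.vstack([actor_action, candidates])
    q_values = np.array([critic(state, a) for a in candidates])
    return candidates[int(np.argmax(q_values))]

# Toy critic (hypothetical): prefers actions close to 0.5
def toy_critic(state, action):
    return -float(np.sum((action - 0.5) ** 2))

state = np.zeros(3)
chosen = greedy_explore(np.array([0.0]), toy_critic, state,
                        rng=np.random.default_rng(0))
```

Because the actor's own action is always among the candidates, the selected action scores at least as well under the critic as the unperturbed one. In a full DDPG loop the selected action would be executed and stored in the replay buffer in place of the usual noise-perturbed action.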

English keywords:

formation control; avoiding obstacles and collisions; reinforcement learning; centralized collaboration

Authors:

黄号, 马文卉, 李家诚, 方洋旺

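Author affiliations:

Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an 710072

School of Automation, Northwestern Polytechnical University, Xi'an 710072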

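Chinese keywords:

formation control (编队控制); obstacle and collision avoidance (避障避碰); reinforcement learning (强化学习); centralized collaboration (集中式协同)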

Funding:

National Natural Science Foundation of China, General Program

Grant number:

61973253

Publication year:

2024

DOI:

10.16511/j.cnki.qhdxxb.2023.27.001

Journal of Tsinghua University (Science and Technology)

Tsinghua University

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.586

ISSN: 1000-0054

Year, volume (issue): 2024, 64(2)

Number of references: 23