Reinforcement learning-based safety obstacle avoidance and capture guidance for UAV
To solve the problem that an unmanned aerial vehicle (UAV) faces a mutual constraint between obstacle avoidance and target tracking in a constrained environment, a reinforcement learning-based safety obstacle avoidance and capture guidance method for UAVs is proposed. Based on a polar-coordinate formulation, a surround tracking controller is designed to drive the UAV onto a preset circular orbit in a GPS-denied environment. The surround constraint and the obstacle avoidance constraint are both formulated as a Markov decision process, taking velocity, radial error, angular-velocity error, and an obstacle function as the state space, and the control compensation as the action space. A reward function accounting for both radial error and obstacle probability is designed. The deep deterministic policy gradient (DDPG) algorithm is used to train the resulting agent, which enhances tracking performance and endows the UAV with obstacle avoidance capability, thereby realizing surround tracking of stationary and moving targets. Additionally, introducing curriculum learning into the training process transfers previously learned strategies to the current task and yields a faster convergence rate than classical random parameter initialization. Finally, simulation results show that the proposed algorithm can guide the UAV in elliptical surround control while avoiding obstacles efficiently.
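The MDP formulation described above can be sketched in minimal form. Everything here is an illustrative assumption rather than the paper's actual definitions: the exponential obstacle function, the weights `w_track` and `w_obs`, and all function names are hypothetical stand-ins for the state vector (velocity, radial error, angular-velocity error, obstacle function) and the reward trading off radial error against obstacle proximity.

```python
import math

def obstacle_function(uav_pos, obstacle_pos, safe_radius=5.0):
    """Illustrative obstacle term: approaches 1 as the UAV nears the
    obstacle's safety radius, decays toward 0 far away (assumed form)."""
    d = math.dist(uav_pos, obstacle_pos)
    return math.exp(-max(d - safe_radius, 0.0))

def make_state(speed, radial_error, ang_vel_error, uav_pos, obstacle_pos):
    """State space per the abstract: velocity, radial error,
    angular-velocity error, and the obstacle function value."""
    return (speed, radial_error, ang_vel_error,
            obstacle_function(uav_pos, obstacle_pos))

def reward(radial_error, uav_pos, obstacle_pos, w_track=1.0, w_obs=10.0):
    """Reward penalizing radial tracking error and obstacle proximity;
    the weights are arbitrary illustrative choices."""
    return (-w_track * abs(radial_error)
            - w_obs * obstacle_function(uav_pos, obstacle_pos))
```

With this shaping, an action (a compensation added to the polar-coordinate surround controller's command) that keeps the radial error small while steering clear of obstacles accumulates higher reward, which is what the DDPG agent is trained to maximize.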