Moving target recognition and unmanned aerial vehicle following based on deep deterministic policy gradient

When an unmanned aerial vehicle (UAV) platform collects image information of a moving target, the target is easily lost because of the UAV's own flight state, environmental interference, and the randomness of the target's motion. To address this problem, a UAV following method is proposed that combines moving target recognition with the deep deterministic policy gradient (DDPG) algorithm. For vehicle targets on highways, the relationship among UAV altitude, UAV attitude, and high-speed vehicle motion is analyzed, and a speed-adaptive model of the target detection frame rate on a mobile platform is established. From the motion state of the target, the model computes a matching UAV flight state and corrects the flight attitude and speed in real time, so that the UAV maintains its relative position and angle with respect to the target. On this basis, the value network of the DDPG algorithm estimates the value of the UAV taking a specific action in a given state, while the policy network generates the action the UAV takes in that state. The policy network outputs flight altitude and speed control parameters for target tracking, so that the UAV automatically adjusts its flight state as the target's motion changes and follows the moving target adaptively. Simulation experiments show that the DDPG algorithm provides stable flight attitude data and a reliable control basis for the UAV following task. Experiments in real-world scenes verify that the UAV can track, in real time, ground moving targets with speeds of 0 to 33 m/s inside a circular area with a radius of 120 m, and that it can follow the target continuously and stably within its endurance range.
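The division of labour between the value network and the policy network described in the abstract can be illustrated with a minimal actor-critic sketch. It is not the authors' implementation: the state and action dimensions, the single linear layers, and the helper names (`init_linear`, `actor`, `critic`, `td_target`, `soft_update`) are assumptions chosen only to show the standard DDPG bootstrap target and target-network update in plain Python.

```python
import math
import random

STATE_DIM = 4   # hypothetical: x/y offset to target, target speed, UAV altitude
ACTION_DIM = 2  # hypothetical: altitude and speed commands scaled to [-1, 1]


def init_linear(in_dim, out_dim, rng):
    """One linear layer with small random weights and zero bias."""
    return {
        "W": [[rng.gauss(0.0, 0.1) for _ in range(out_dim)] for _ in range(in_dim)],
        "b": [0.0] * out_dim,
    }


def linear(params, x):
    """Compute x @ W + b for a single input vector."""
    return [
        sum(x[i] * params["W"][i][j] for i in range(len(x))) + params["b"][j]
        for j in range(len(params["b"]))
    ]


def actor(params, state):
    """Policy network mu(s): deterministic, bounded action for a state."""
    return [math.tanh(v) for v in linear(params, state)]


def critic(params, state, action):
    """Value network Q(s, a): scalar value of taking `action` in `state`."""
    return linear(params, list(state) + list(action))[0]


def td_target(reward, next_state, done, target_actor, target_critic, gamma=0.99):
    """Bootstrap target y = r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    next_action = actor(target_actor, next_state)
    next_q = critic(target_critic, next_state, next_action)
    return reward + gamma * (1.0 - float(done)) * next_q


def soft_update(target, online, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return {
        "W": [[tau * o + (1.0 - tau) * t for t, o in zip(t_row, o_row)]
              for t_row, o_row in zip(target["W"], online["W"])],
        "b": [tau * o + (1.0 - tau) * t for t, o in zip(target["b"], online["b"])],
    }


rng = random.Random(0)
actor_net = init_linear(STATE_DIM, ACTION_DIM, rng)
critic_net = init_linear(STATE_DIM + ACTION_DIM, 1, rng)
# Target networks start as copies of the online networks (tau=1.0 copies them).
target_actor = soft_update(actor_net, actor_net, tau=1.0)
target_critic = soft_update(critic_net, critic_net, tau=1.0)

state = [2.0, -1.0, 25.0, 60.0]       # made-up observation
next_state = [1.5, -0.8, 25.5, 60.0]  # made-up next observation
action = actor(actor_net, state)
y = td_target(-0.3, next_state, False, target_actor, target_critic)
critic_error = critic(critic_net, state, action) - y  # drives the critic update
```

In a full training loop the critic would be fitted to reduce `critic_error` over a replay batch, the actor would be adjusted to increase `critic(critic_net, s, actor(actor_net, s))`, and `soft_update` would be applied to both target networks after each step. Gradient computation and exploration noise are omitted here for brevity.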

Keywords: quadrotor; freeway; dynamic programming; deep deterministic policy gradient (DDPG); target tracking

Liu Xin (刘欣), Zhang Qianfei (张倩飞), Liu Chengyu (刘成宇), Gao Han (高涵)

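School of Information and Navigation, Air Force Engineering University, Xi'an 710038, Shaanxi

School of Information Engineering, Xi'an Eurasia University, Xi'an 710068, Shaanxi

School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, Shaanxi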

2024

Journal of Xi'an Polytechnic University
Xi'an Polytechnic University

CSTPCD
Impact factor: 0.473
ISSN: 1674-649X
Year, volume (issue): 2024, 38(4)