Position Control of Quadrotor Based on PPO Algorithm

Yang Zongyue (杨宗月)¹, Liu Lei (刘磊)¹, Liu Chen (刘晨)¹

Author information

1. College of Science, Hohai University, Nanjing, Jiangsu 211100, China

Abstract

To address hover control and trajectory tracking for quadrotor UAVs, the proximal policy optimization (PPO) algorithm is used to control the quadrotor. A neural network trained by reinforcement learning maps the state directly to the four rotors; this technique can be used to control any linear or nonlinear system under unknown dynamic parameters and disturbances. In addition, a novel reward function based on reward shaping is proposed; compared with the traditional PID algorithm, it makes the UAV fly faster and more smoothly. Experiments show that the quadrotor can hover at a fixed point and track trajectories in three-dimensional space with high accuracy and stability, with an accuracy of up to 97.2%. The position controller presented in the paper also exhibits generalization and robustness.
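
The abstract names proximal policy optimization as the training algorithm. For readers unfamiliar with it, the clipped surrogate objective commonly used in PPO can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `ppo_clip_objective`, its arguments, and the clipping value `clip_eps=0.2` are assumptions chosen for the example and do not come from the article.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed value,
    not one reported in the article.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is often chosen for continuous control tasks such as rotor command generation.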

Key words

UAV/Quadrotor/Reinforcement learning/Position control/Proximal policy optimization

Funding

National Natural Science Foundation of China (61773152)

Publication year

2024

Computer Simulation (计算机仿真)

The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD

Impact factor: 0.518

ISSN: 1006-9348

Number of references: 15
