Position Control of Quadrotor Based on PPO Algorithm
杨宗月 1, 刘磊 1, 刘晨 1
Author information
- 1. College of Science, Hohai University, Nanjing, Jiangsu 211100
Abstract
To address hover control and trajectory tracking for quadrotor UAVs, the proximal policy optimization (PPO) algorithm is used to control the quadrotor. A neural network trained by reinforcement learning maps the state directly to the four rotors, a technique that can control any linear or nonlinear system under unknown dynamic parameters and disturbances. Based on reward shaping, a novel reward function is proposed that, compared with the traditional PID algorithm, makes the UAV fly faster and more smoothly and speeds up convergence of the algorithm. Experiments show that the quadrotor can hover at a fixed point and track trajectories in three-dimensional space with high accuracy and stability, reaching an accuracy of 97.2%. The position controller presented in this paper exhibits generalization and robustness.
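For readers unfamiliar with proximal policy optimization, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not taken from the paper: the function name `ppo_clip_objective`, the argument names, and the clipping value 0.2 are illustrative assumptions, and the abstract does not describe the network, the state vector, or the reward function.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    clip_eps: clipping range; 0.2 is a common default, assumed here.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data, which is the property that makes PPO updates comparatively stable.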
Keywords
UAV/Quadrotor/Reinforcement learning/Position control/Proximal policy optimization
Publication year
2024