Robust observer-based deep reinforcement learning for attitude stabilization of vertical takeoff and landing vehicle
This paper studies a robust observer-based proximal policy optimization (ROB-PPO) control method for the attitude stabilization of vertical takeoff and landing vehicles subject to elastic vibration and model uncertainty disturbances. The method combines a robust observer with the proximal policy optimization algorithm from deep reinforcement learning. The robust observer reconstructs the vehicle attitude information corrupted by elastic vibration. The observer and the vehicle dynamics model together form the environment, and the reconstructed attitude produced by the observer serves as the state of the deep reinforcement learning algorithm. The agent interacts continuously with this environment and is thereby trained to stabilize the vehicle attitude. Simulation results show that the ROB-PPO algorithm is more robust and converges faster than the commonly used adaptive fuzzy proportional-integral-derivative (PID) algorithm. Finally, the effectiveness of the proposed algorithm is verified on a self-developed vertical takeoff and landing vehicle.
vertical takeoff and landing vehicle; attitude control; robust observer; deep reinforcement learning
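
The environment structure described in the abstract, in which the observer and the vehicle dynamics together form the environment and the agent sees only the reconstructed attitude, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single-axis rigid-body model, the sinusoidal vibration on the measurement, the Luenberger-style observer standing in for the robust observer, the gains, the reward, and the names `ObserverAttitudeEnv`, `rollout`, and `pd_policy`. A fixed PD law stands in for the trained PPO policy so that the sketch runs without a learning library.

```python
import math
import numpy as np


class ObserverAttitudeEnv:
    """Toy single-axis environment: vehicle dynamics plus an attitude observer.

    Hypothetical stand-in for the paper's setup; the agent only ever receives
    the observer's estimate, never the true state.
    """

    def __init__(self, dt=0.01, inertia=1.0, l1=20.0, l2=100.0,
                 vib_amp=0.02, vib_freq_hz=15.0):
        self.dt, self.inertia = dt, inertia
        self.l1, self.l2 = l1, l2              # assumed observer gains
        self.vib_amp, self.vib_freq_hz = vib_amp, vib_freq_hz
        self.reset()

    def reset(self):
        self.t = 0.0
        self.x = np.array([0.2, 0.0])          # true [angle (rad), rate (rad/s)]
        self.x_hat = np.zeros(2)               # observer estimate of the same
        return self.x_hat.copy()

    def step(self, torque):
        dt = self.dt
        accel = torque / self.inertia
        theta, omega = self.x
        th_hat, om_hat = self.x_hat

        # Attitude measurement corrupted by an assumed elastic-vibration term.
        y = theta + self.vib_amp * math.sin(
            2.0 * math.pi * self.vib_freq_hz * self.t)

        # Observer update (explicit Euler): model prediction plus correction.
        err = y - th_hat
        self.x_hat = np.array([th_hat + (om_hat + self.l1 * err) * dt,
                               om_hat + (accel + self.l2 * err) * dt])

        # True rigid-body dynamics (explicit Euler), hidden from the agent.
        self.x = np.array([theta + omega * dt, omega + accel * dt])
        self.t += dt

        # Assumed quadratic penalty on estimated attitude error and effort.
        reward = -(self.x_hat[0] ** 2 + 0.1 * self.x_hat[1] ** 2
                   + 0.001 * torque ** 2)
        return self.x_hat.copy(), reward


def rollout(env, policy, steps):
    """Run one episode; the policy maps the reconstructed state to a torque."""
    obs = env.reset()
    total = 0.0
    for _ in range(steps):
        obs, reward = env.step(policy(obs))
        total += reward
    return obs, total


def pd_policy(obs):
    """Placeholder for the trained PPO policy (fixed PD law, assumed gains)."""
    return -4.0 * obs[0] - 4.0 * obs[1]
```

Calling `rollout(ObserverAttitudeEnv(), pd_policy, 500)` runs one five-second episode. In a training setup, `pd_policy` would be replaced by the policy network being optimized, and the per-step rewards would feed the PPO update.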