Abstract
Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in interactive environments, where finding the optimal maneuvering decision-making policy has become one of the key issues in enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks for the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs). In particular, we present a reward shaping method for the guidance-towards-area and guidance-towards-specific-point tasks using a potential-based function and expert-guided advice. The proposed algorithm can accelerate the convergence of the maneuvering decision-making policy and increase the stability of the policy's output during the later stage of the training process. The effectiveness of the proposed maneuvering decision-making policy is illustrated by the curves of training parameters and by extensive experimental results from testing the trained policy.
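As a rough illustration of the potential-based reward shaping mentioned in the abstract, the following minimal sketch adds the standard shaping term gamma * Phi(s') - Phi(s) to a base reward. The potential used here (negative Euclidean distance to a target point), the discount factor value, and the function names are assumptions made for illustration only; they are not taken from the paper, whose actual potential function and expert-guided advice terms are not given in this section.

```python
import math

GAMMA = 0.99  # discount factor; assumed value, not from the paper


def potential(position, target):
    """Hypothetical potential Phi: negative Euclidean distance to the target."""
    return -math.dist(position, target)


def shaped_reward(base_reward, state, next_state, target, gamma=GAMMA):
    """Return base_reward + gamma * Phi(next_state) - Phi(state).

    This is the generic potential-based shaping form; the choice of Phi
    above is an illustrative assumption.
    """
    shaping = gamma * potential(next_state, target) - potential(state, target)
    return base_reward + shaping


if __name__ == "__main__":
    target = (0.0, 0.0)
    # Moving closer to the target yields a positive shaping bonus.
    print(shaped_reward(0.0, (10.0, 0.0), (9.0, 0.0), target))
    # Moving away from the target yields a negative shaping term.
    print(shaped_reward(0.0, (9.0, 0.0), (10.0, 0.0), target))
```

With this choice of potential, a step towards the target point is rewarded and a step away is penalized, which is the kind of dense guidance signal that shaping is intended to provide for a guidance-towards-specific-point task.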
Funding
Key Research and Development Program of Shaanxi (2022GXLH-02-09)
Aeronautical Science Foundation of China (20200051053001)
Natural Science Foundation of Shaanxi Province (2020JM-147)