UAV maneuvering decision-making algorithm based on deep reinforcement learning under the guidance of expert experience

Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in an interactive environment, where finding the optimal maneuvering decision-making policy has become one of the key issues for enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks of the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs). In particular, we present a reward shaping method for the guidance-towards-area and guidance-towards-specific-point tasks using a potential-based function and expert-guided advice. The proposed algorithm can accelerate the convergence of the maneuvering decision-making policy and increase the stability of the policy's output during the later stage of the training process. The effectiveness of the proposed maneuvering decision-making policy is illustrated by the curves of training parameters and by extensive experimental results from testing the trained policy.
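
The abstract does not give the form of the shaping function, so the following is only a minimal sketch of generic potential-based reward shaping, where a term F = gamma * Phi(s') - Phi(s) is added to the environment reward. The potential used here (negative Euclidean distance to a target point), the discount factor, and the names `potential` and `shaped_reward` are all assumptions made for illustration and are not taken from the paper.

```python
import math

GAMMA = 0.99  # assumed discount factor, not specified in the abstract


def potential(position, target):
    """Hypothetical potential: negative Euclidean distance to the target point."""
    return -math.dist(position, target)


def shaped_reward(env_reward, position, next_position, target, gamma=GAMMA):
    """Return the environment reward plus F = gamma * Phi(s') - Phi(s)."""
    shaping = gamma * potential(next_position, target) - potential(position, target)
    return env_reward + shaping
```

With this choice of potential, a step that moves the vehicle closer to the target receives a positive shaping bonus and a step that moves it away receives a penalty, which is one plausible way to encode a "guide towards a point" hint.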

unmanned aerial vehicle (UAV); maneuvering decision-making; autonomous air-delivery; deep reinforcement learning; reward shaping; expert experience

ZHAN Guang, ZHANG Kun, LI Ke, PIAO Haiyin

School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China

Science and Technology on Electro-Optic Control Laboratory, Luoyang 471009, China

Key Research and Development Program of Shaanxi (2022GXLH-02-09); Aeronautical Science Foundation of China (20200051053001); Natural Science Foundation of Shaanxi Province (2020JM-147)

2024

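Journal of Systems Engineering and Electronics
China Aerospace Science and Industry Corporation Defense Technology Academy; Chinese Society of Astronautics; Systems Engineering Society of China; China Simulation Federation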

CSTPCD
Impact factor: 0.64
ISSN:1004-4132
Year, volume (issue): 2024, 35(3)