Online route planning decision-making method for aircraft in complex environments

杨志鹏¹, 陈子浩¹, 曾长¹, 林松¹, 毛金娣¹, 张凯¹

Author information

1. System Design Institute of Hubei Aerospace Technology Academy, Wuhan, Hubei 430040, China

Abstract

To address the problem of online route planning for aircraft, an online autonomous decision-making method based on deep reinforcement learning (DRL) is proposed. First, the motion model and detection model of the aircraft are described. The deep deterministic policy gradient (DDPG) algorithm from DRL is then used to construct the framework of the aircraft flight control policy model. On this basis, a curriculum learning (CL) based algorithm, CL-DDPG, is proposed. It decomposes the online route planning task and guides the aircraft to learn strategies for target approach, threat avoidance, and route optimization, with corresponding Gaussian noise set to help the aircraft explore and optimize its policy. This achieves adaptive learning and decision-making control of the aircraft in complex scenarios. Simulation experiments show that the CL-DDPG algorithm effectively improves the training efficiency of the model, and that the resulting model achieves a higher task success rate, shows excellent generalization and robustness, and is better suited to online route planning tasks in complex dynamic environments.
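The curriculum idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stage names follow the three sub-tasks named in the abstract, while the `Stage` class, the helper functions `noisy_action` and `next_stage`, the noise standard deviations, the promotion thresholds, and the action bounds are all hypothetical values chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    noise_std: float   # assumed Gaussian exploration std for this stage
    promote_at: float  # assumed success rate needed to move to the next stage


# Stage names follow the abstract; every numeric value is an illustrative assumption.
CURRICULUM = [
    Stage("target_approach", 0.30, 0.80),
    Stage("threat_avoidance", 0.20, 0.80),
    Stage("route_optimization", 0.10, 0.80),
]


def noisy_action(policy_action, noise_std, rng, low=-1.0, high=1.0):
    """DDPG-style exploration: add zero-mean Gaussian noise to the
    deterministic policy output, then clip to the assumed action bounds."""
    return [min(high, max(low, a + rng.gauss(0.0, noise_std)))
            for a in policy_action]


def next_stage(stage_idx, recent_successes):
    """Advance to the next curriculum stage once the success rate over the
    recent episodes reaches the current stage's threshold."""
    if not recent_successes:
        return stage_idx
    rate = sum(recent_successes) / len(recent_successes)
    is_last = stage_idx == len(CURRICULUM) - 1
    if rate >= CURRICULUM[stage_idx].promote_at and not is_last:
        return stage_idx + 1
    return stage_idx
```

In a training loop, an agent would call `noisy_action` with the current stage's `noise_std` when collecting experience and call `next_stage` after each evaluation window. Whether the paper promotes stages by success rate or by another criterion is not stated in the abstract.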

Key words

online route planning / deep reinforcement learning (DRL) / autonomous decision-making / curriculum learning / threat avoidance

Funding

National Natural Science Foundation of China (62003267)

Publication year

2024

Systems Engineering and Electronics

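Sponsors: Defense Technology Academy of China Aerospace Science and Industry Corporation; Chinese Society of Astronautics; Systems Engineering Society of China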

Indexed in: CSTPCD, CSCD, Peking University Core Journals

Impact factor: 0.847

ISSN: 1001-506X

Number of references: 30
