Online route planning decision-making method for aircraft in complex environments

To address the problem of online route planning for aircraft, an online autonomous decision-making method based on deep reinforcement learning (DRL) is proposed. First, the motion model and detection model of the aircraft are described. The deep deterministic policy gradient (DDPG) algorithm is then used to construct the framework of the aircraft flight control policy model. On this basis, a CL-DDPG algorithm based on curriculum learning (CL) is proposed. It decomposes the online route planning task and guides the aircraft to learn strategies for target approach, threat avoidance, and route optimization. Corresponding Gaussian noise is set to help the aircraft explore and optimize its policy, realizing adaptive learning and decision-making control in complex scenarios. Simulation experiments show that the CL-DDPG algorithm effectively improves the training efficiency of the model, and that the resulting model achieves a higher task success rate with good generalization and robustness, making it better suited to online route planning tasks in complex dynamic environments.
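The abstract names two mechanisms without giving details: a curriculum that splits the task into sub-tasks, and Gaussian noise set per sub-task for exploration. The sketch below is one possible way to express that idea, not the authors' implementation. The stage names follow the abstract, but the noise standard deviations, the promotion threshold, the action bounds, and the helper names `Stage`, `explore`, and `next_stage` are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Stage:
    """One curriculum stage (all values here are illustrative assumptions)."""
    name: str
    noise_std: float   # std of the Gaussian exploration noise in this stage
    promote_at: float  # recent success rate required to advance


# Hypothetical three-stage curriculum mirroring the sub-tasks in the abstract.
CURRICULUM = [
    Stage("target_approach", noise_std=0.30, promote_at=0.8),
    Stage("threat_avoidance", noise_std=0.20, promote_at=0.8),
    Stage("route_optimization", noise_std=0.10, promote_at=0.8),
]


def explore(action, noise_std, low=-1.0, high=1.0, rng=random):
    """Add zero-mean Gaussian noise to a deterministic action, then clip."""
    return [min(high, max(low, a + rng.gauss(0.0, noise_std))) for a in action]


def next_stage(index, recent_successes):
    """Advance one stage once the recent success rate reaches the threshold."""
    if not recent_successes:
        return index
    rate = sum(recent_successes) / len(recent_successes)
    if rate >= CURRICULUM[index].promote_at and index < len(CURRICULUM) - 1:
        return index + 1
    return index
```

In a training loop, the actor's output would pass through `explore` with the current stage's `noise_std`, and `next_stage` would be called after each batch of episodes with a list of 0/1 success flags. Whether the paper switches stages on success rate or on a fixed schedule is not stated in the abstract.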

Keywords: online route planning; deep reinforcement learning (DRL); autonomous decision-making; curriculum learning; threat avoidance

Authors: 杨志鹏, 陈子浩, 曾长, 林松, 毛金娣, 张凯

System Design Institute of Hubei Aerospace Technology Academy, Wuhan 430040, Hubei, China

Funding: National Natural Science Foundation of China (62003267)

2024

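Systems Engineering and Electronics
Defense Technology Academy of China Aerospace Science and Industry Corporation; Chinese Society of Astronautics; Systems Engineering Society of China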

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.847
ISSN: 1001-506X
Year, volume (issue): 2024, 46(9)