Journal of Astronautics (宇航学报), 2024, Vol. 45, Issue 9: 1429-1444. DOI: 10.3873/j.issn.1000-1328.2024.09.009

Autonomous Entry Guidance Method for Online Encounters with Multiple No-fly Zones (多禁飞区在线遭遇的自主规避再入制导方法)

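Wang Haoning (王浩凝) [1], Guo Jie (郭杰) [1], Zhang Baochao (张宝超) [1], Wang Ziyao (王子瑶) [2], Tang Shengjing (唐胜景) [1], Li Xiang (李响) [1]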

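Author information

  • 1. School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
  • 2. Beijing Institute of Astronautical Systems Engineering, Beijing 100076, China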

Abstract

To meet the need for hypersonic glide vehicles to evade multiple unknown threats during entry, an autonomous avoidance entry guidance method is proposed for multiple no-fly zones encountered online. The continuous avoidance of multiple no-fly zones encountered in flight is abstracted as a sequential decision-making problem, and a reinforcement-learning-based solution is designed to improve the vehicle's autonomous avoidance capability. A Markov decision process for the no-fly zone avoidance problem is formulated with full consideration of the generalization capability and training efficiency of the reinforcement learning agent. On this basis, a multi-agent coordinated decision-making method based on a fuzzy control strategy is designed: a heading decision agent is assigned to each no-fly zone encountered online and makes an independent heading decision, the threat level of each no-fly zone is evaluated from the real-time environment, and the decisions are coordinated to generate the heading command. Theoretical analysis and numerical simulations show that the method enables the vehicle to avoid multiple no-fly zones encountered online while satisfying the terminal and process constraints, and that it has good robustness and generalization capability.
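The coordination step summarized in the abstract (one heading proposal per no-fly zone, weighted by an assessed threat level) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the flat-plane geometry, the `Zone` class, the ramp-shaped membership functions and their thresholds, and the weighted circular mean used for blending. In the paper the per-zone proposals come from trained reinforcement learning agents; here they are simply passed in as numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class Zone:
    """Circular no-fly zone in hypothetical flat-plane coordinates (km)."""
    x: float
    y: float
    radius: float


def ramp_down(value, full, zero):
    """Membership: 1 at or below `full`, 0 at or above `zero`, linear between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)


def threat_level(px, py, heading, zone):
    """Fuzzy AND (minimum) of 'zone is near' and 'zone is ahead'.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    dx, dy = zone.x - px, zone.y - py
    clearance = math.hypot(dx, dy) - zone.radius
    bearing = math.atan2(dy, dx)
    diff = bearing - heading
    # Wrap the bearing error into [-pi, pi] before taking its magnitude.
    off_axis = abs(math.atan2(math.sin(diff), math.cos(diff)))
    near = ramp_down(clearance, 100.0, 1000.0)
    ahead = ramp_down(off_axis, math.radians(20.0), math.radians(90.0))
    return min(near, ahead)


def blend_headings(proposals, weights, fallback):
    """Threat-weighted circular mean of the per-zone heading proposals (rad).

    Returns `fallback` (for example, the heading toward the target) when no
    zone carries any weight.
    """
    s = sum(w * math.sin(h) for h, w in zip(proposals, weights))
    c = sum(w * math.cos(h) for h, w in zip(proposals, weights))
    if math.hypot(s, c) < 1e-9:
        return fallback
    return math.atan2(s, c)
```

A caller would compute `threat_level` for every detected zone, collect one proposed heading per zone from its agent, and pass both lists to `blend_headings`. Averaging on the unit circle avoids the wrap-around error that a plain arithmetic mean of angles would introduce near ±π.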

Key words

Entry guidance / Multiple no-fly zone avoidance / Reinforcement learning / Fuzzy control / Online autonomous decision-making

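Publication year: 2024
Journal: Journal of Astronautics (宇航学报)
Chinese Society of Astronautics (中国宇航学会)
Indexed in: CSTPCD, CSCD, PKU Core (北大核心)
Impact factor: 0.887
ISSN: 1000-1328
Number of references: 39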