Flight conflict resolution method based on relative entropy inverse reinforcement learning

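Chinese abstract (translated): To address flight conflict resolution on air routes, a flight conflict resolution method based on relative entropy inverse reinforcement learning is proposed. First, a relative-entropy-based inverse reinforcement learning algorithm learns the controllers' implicit prior knowledge from historical flight trajectory data and expresses it quantitatively as a reward function. The reward function is then introduced into a conflict resolution model based on deep reinforcement learning, guiding the model during training to update toward resolution schemes similar to those of controllers. Test results show that the resolution model can learn controllers' prior knowledge and that its conflict resolution rate on the test set exceeds 73%. The study offers a useful reference for reducing controller workload and improving the safety of air traffic control.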
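As a rough illustration of the reward-learning step, the sketch below shows one common formulation of relative entropy inverse reinforcement learning: gradient ascent on reward weights `theta` so that the importance-weighted feature expectations of sampled trajectories move toward those of the expert (controller) trajectories. The linear reward form, the uniform sampling policy, the omission of the regularization term, and every name used here (`reirl_gradient`, `fit_reward_weights`, the toy feature vectors) are assumptions made for illustration; the abstract does not specify the features or implementation actually used.

```python
import numpy as np

def reirl_gradient(theta, expert_features, sampled_features):
    """Gradient of a relative-entropy IRL objective w.r.t. reward weights.

    Assumes a linear trajectory reward r(tau) = theta . f(tau) and sampled
    trajectories drawn from a uniform baseline policy (both hypothetical).
    """
    # Mean feature vector of the expert (controller) trajectories.
    expert_mean = expert_features.mean(axis=0)
    # Importance weights: softmax of the sampled trajectories' rewards.
    returns = sampled_features @ theta
    weights = np.exp(returns - returns.max())  # subtract max for stability
    weights /= weights.sum()
    # Feature expectation under the reweighted sample distribution.
    model_mean = weights @ sampled_features
    return expert_mean - model_mean

def fit_reward_weights(expert_features, sampled_features, lr=0.1, steps=500):
    """Plain gradient ascent on theta, starting from zero weights."""
    theta = np.zeros(expert_features.shape[1])
    for _ in range(steps):
        theta += lr * reirl_gradient(theta, expert_features, sampled_features)
    return theta

# Toy data: two hypothetical trajectory features per trajectory.
sampled = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
expert = np.array([[1.0, 0.0], [1.0, 0.0], [0.8, 0.2]])
theta = fit_reward_weights(expert, sampled)
print("learned reward weights:", theta)
```

In this toy setup the expert trajectories score high on the first feature and low on the second, so the fitted weights should reward the first feature and penalize the second.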

English abstract: The primary objective of air traffic management is to ensure the safety of aircraft in flight. Flight conflicts can lead to hazardous proximity or even collisions, with severe consequences, so auxiliary tools that help controllers resolve flight conflicts are essential. This article aims to raise the personalization level of control decision-support tools and to improve controllers' acceptance of the conflict resolution solutions these tools provide. First, an inverse reinforcement learning method based on relative entropy is used to extract implicit controller instruction strategies from aircraft flight trajectory data and represent them as reward functions. The flight conflict resolution problem is then modeled as a Markov decision process, and a deep reinforcement learning method (the DQN algorithm) is used to train the model under the guidance of this reward function, with the goal of raising both the success rate of the resolution model and the degree of strategy personalization. The article also introduces analysis indicators from two perspectives: safety and applicability. Finally, a simulation system based on the Base of Aircraft Data (BADA) database is used to generate 5,000 flight conflict scenarios, of which 4,000 are used for model training and the remaining 1,000 to verify the effectiveness of the proposed method. Experimental results show that, guided by a reward function that incorporates controller strategies, the resolution model steadily improves both its success rate on flight conflict scenarios and its similarity to controller strategies. In the testing phase, the successful resolution rate exceeds 70%. This result indicates that the inverse reinforcement learning method based on relative entropy effectively learns the empirical knowledge of controllers, thereby improving the efficiency and personalization level of the resolution model. The method offers a novel approach to studying and improving the personalization of control schemes, which has practical significance for improving the efficiency of air traffic operations and ensuring airspace safety.
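The way a learned reward function can guide DQN training may be sketched through the standard one-step temporal-difference target, with the environment reward replaced by the reward recovered in the inverse reinforcement learning stage. The linear reward form, the feature vector, the discount factor, and the function names (`learned_reward`, `dqn_td_target`) are hypothetical choices for illustration and are not taken from the article.

```python
import numpy as np

def learned_reward(theta, features):
    """Assumed linear reward r(s, a) = theta . f(s, a) from the IRL stage."""
    return float(np.dot(theta, features))

def dqn_td_target(theta, features, next_q_values, done, gamma=0.99):
    """One-step DQN target y = r + gamma * max_a' Q_target(s', a').

    next_q_values holds the target network's Q-values for every action
    in the next state; terminal transitions use the reward alone.
    """
    reward = learned_reward(theta, features)
    if done:
        return reward
    return reward + gamma * float(np.max(next_q_values))

# Toy transition with two hypothetical state-action features.
theta = np.array([0.5, -1.0])
features = np.array([2.0, 0.5])
next_q = np.array([1.0, 3.0, 2.0])
print(dqn_td_target(theta, features, next_q, done=False, gamma=0.9))
```

The Q-network would then be regressed toward this target for the action actually taken, so actions that resemble controller behavior accumulate higher value under the learned reward.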

English keywords:

safety engineering; air traffic control; flight conflict resolution; inverse reinforcement learning; deep reinforcement learning

Authors:

SUI Dong (隋东), DONG Jintao (董金涛)

Author affiliation:

College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106

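Keywords (translated from Chinese):

safety engineering; air traffic control; flight conflict resolution; inverse reinforcement learning; deep reinforcement learning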

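Funding:

Civil Aviation Administration of China funded project; Nanjing University of Aeronautics and Astronautics Research and Practice Innovation Program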

Project numbers:

[2022] No. 125; xcxjh20220710

Publication year:

2024

DOI:

10.13637/j.issn.1009-6094.2023.0406

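Journal of Safety and Environment (安全与环境学报)

Beijing Institute of Technology; Chinese Society for Environmental Sciences; China Occupational Safety and Health Association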

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.943

ISSN: 1009-6094

Year, volume (issue): 2024, 24(3)

Number of references: 21