A Multi-functional Radar Jamming Decision Method Based on Proximal Policy Optimization Algorithm and Mask-TIT Network

To cope with the challenges that increasingly intelligent multifunctional radars pose to the opposing side, this paper proposes a jamming decision-making method based on the proximal policy optimization (PPO) algorithm and the Mask-Transformer in Transformer (Mask-TIT) network. First, starting from a realistic scenario, the confrontation between the jammer and the radar is modeled as a partially observable Markov decision process (POMDP). A new state transition function and a new reward function are designed based on the working principles of the radar, and the observation space is designed according to the hierarchical model of the multifunctional radar. Second, a Mask-TIT network structure is designed by exploiting the Transformer's capacity to represent sequence data and the characteristics of radar jamming patterns; it is used to build a more powerful Actor-Critic network architecture. Finally, the PPO algorithm is used for optimization learning. Experimental results show that, compared with existing methods, the proposed algorithm reduces the interaction data required for convergence by 25.6% on average, and the variance after convergence is significantly lower.
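
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not the authors' implementation: the function name `ppo_clipped_objective`, the parameter `clip_eps`, and its default value of 0.2 are illustrative assumptions, and the abstract does not state which hyperparameters or network outputs the paper uses.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; not given in the abstract
) -> float:
    """Mean PPO clipped surrogate objective over a batch of samples.

    ratios[i]     = pi_new(a_i | o_i) / pi_old(a_i | o_i)
    advantages[i] = advantage estimate for sample i
    """
    if len(ratios) != len(advantages) or len(ratios) == 0:
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch of three samples, for illustration only.
    print(ppo_clipped_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data, which is the usual reason PPO is chosen for stable training.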

radar jamming decision; partially observable Markov decision process (POMDP); reinforcement learning; Transformer; proximal policy optimization (PPO)

Lou Yuxuan (娄雨璇), Sun Minhong (孙闽红), Yin Shuai (尹帅)

School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018

2024

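Journal of Data Acquisition and Processing (数据采集与处理)
Sponsors: Chinese Institute of Electronics; Signal Processing Society of the China Instrument and Control Society; Weak Signal Detection Society of the China Instrument and Control Society and the Chinese Physical Society; Nanjing University of Aeronautics and Astronautics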

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.679
ISSN: 1004-9037
Year, volume (issue): 2024, 39(6)