An Intelligent Frequency Decision Method for a Frequency Agile Radar Based on Deep Reinforcement Learning

Aiming jamming emitted by self-defense jammers renders many passive, signal-processing-based jamming suppression methods ineffective and poses a severe threat to modern radars. Frequency agility, as an active countermeasure, makes it possible to counter aiming jamming. To address the unstable anti-jamming performance of traditional random frequency hopping, the limited freedom in frequency selection, and the long time required for strategy learning, this paper proposes a fast, adaptive method for learning frequency-hopping strategies for a frequency agile radar. First, a frequency agile waveform in which frequencies may be selected repeatedly is designed, which enlarges the set of candidates for the optimal solution. On this basis, the frequency-selection strategy is continuously optimized using data collected during the ongoing confrontation between the radar and the jammer, together with the exploration and feedback mechanisms of deep reinforcement learning. Specifically, the radar frequencies at the previous time step and the jamming frequencies perceived at the current time step serve as the reinforcement learning input; the neural network selects the frequency of each subpulse at the current time step; and anti-jamming effectiveness is evaluated from both the target detection result and the Signal-to-Jamming-plus-Noise Ratio (SJNR), which drives the strategy toward the optimum. To speed up convergence to the optimal strategy, the input state is designed to be independent of historical time steps, a greedy strategy is introduced to balance exploration and exploitation, and the SJNR is used to increase the differences between rewards. Multiple sets of simulation experiments show that the proposed method converges to the optimal strategy with high convergence efficiency.
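
The decision loop summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small Q-table stands in for the deep Q-network, the "greedy strategy" is assumed to be an epsilon-greedy rule, and the constants `NUM_FREQS` and `NUM_SUBPULSES`, the reward weight `sjnr_weight`, and all function names are hypothetical choices made only for illustration.

```python
import itertools
import random

NUM_FREQS = 4       # assumed number of candidate frequency points
NUM_SUBPULSES = 2   # assumed number of subpulses per pulse

# Action space: one frequency index per subpulse. Repetition is allowed,
# mirroring the "repeatable frequency selection" waveform in the abstract.
ACTIONS = list(itertools.product(range(NUM_FREQS), repeat=NUM_SUBPULSES))


def make_state(prev_radar_freqs, jam_freqs):
    """State = previous radar frequencies + currently perceived jamming
    frequencies; no longer history is kept."""
    return tuple(prev_radar_freqs) + tuple(jam_freqs)


def reward(detected, sjnr_db, sjnr_weight=0.1):
    """Hypothetical reward: a detection bonus plus a scaled SJNR term, so
    that two successful detections with different SJNR are told apart."""
    return (1.0 if detected else 0.0) + sjnr_weight * sjnr_db


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore a random action, else exploit
    the action with the largest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def q_update(q_table, state, action, r, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step (a stand-in for a DQN gradient step)."""
    q_s = q_table.setdefault(state, [0.0] * len(ACTIONS))
    q_next = q_table.setdefault(next_state, [0.0] * len(ACTIONS))
    q_s[action] += alpha * (r + gamma * max(q_next) - q_s[action])
```

In use, the radar would build a state with `make_state`, pick an index into `ACTIONS` with `epsilon_greedy`, transmit the corresponding subpulse frequencies, and feed the observed detection result and SJNR through `reward` into `q_update`. A real DQN would replace the table lookup with a network forward pass.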

Keywords: Frequency agile radar; Anti-jamming; Waveform design; Aiming jamming; Deep Q-Network (DQN)

ZHANG Jiaxiang (张嘉翔), ZHANG Kaixiang (张凯翔), LIANG Zhennan (梁振楠), CHEN Xinliang (陈新亮), LIU Quanhua (刘泉华)


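Radar Research Laboratory, School of Information and Electronics, Beijing Institute of Technology, Beijing 100081

Yangtze Delta Region Academy, Beijing Institute of Technology (Jiaxing), Jiaxing 314000

Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120

Key Laboratory of Electronic and Information Technology in Satellite Navigation (Beijing Institute of Technology), Ministry of Education, Beijing 100081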



Funding: National Natural Science Foundation of China (62201048)

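Journal of Radars (雷达学报)
Institute of Electronics, Chinese Academy of Sciences; China Radar Industry Association

Indexed in: CSTPCD, Peking University Core Journals, EI
Impact factor: 0.667
ISSN: 2095-283X
Year, volume (issue): 2024, 13(1)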