Deep reinforcement learning-empowered anti-jamming strategy aided by sample information entropy
Li Gang 1, Wu Qi 1, Wang Xiang 1, Luo Hao 1, Li Lianghong 2, Jing Xiaorong 2, Chen Qianbin 2
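Author information
- 1. Southwest China Institute of Electronic Technology, Chengdu 610036, Sichuan, China
- 2. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China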
Abstract
To counter intelligent jamming driven by deep reinforcement learning (DRL), an anti-jamming communication strategy aided by sample information entropy is proposed. First, an anti-jamming strategy network and an information entropy prediction network are designed based on neural networks. Then, both networks are trained on spectrum waterfall samples, which are formed by applying the short-time Fourier transform to the received signals. The information entropy prediction network is then used for fine-grained selection of the training samples of the anti-jamming strategy network, which improves sample quality and ultimately enhances the online decision-making capability and generalization performance of the anti-jamming strategy. Simulation results indicate that, under the extreme condition where the jammer's strategy update frequency is no more than 40 times that of the communicating party and the maximum number of jammed channels is 3, the proposed strategy still achieves a success rate of at least 61%. Moreover, compared with several other anti-jamming strategies, the proposed strategy converges faster.
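The two signal-processing ideas named in the abstract, building a spectrum waterfall with the short-time Fourier transform and screening training samples by an entropy score, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the frame length and hop size, the Hann window, and the rule of keeping the highest-entropy samples are all assumptions made for illustration. The paper obtains its entropy values from a prediction network, whereas here they are simply passed in as numbers.

```python
import numpy as np


def spectrum_waterfall(signal, frame_len=64, hop=32):
    """Magnitude STFT in dB; rows are time frames, columns are frequency bins.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    signal = np.asarray(signal)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.abs(np.fft.fft(frames, axis=1))
    # Small offset avoids log(0) for empty bins.
    return 20.0 * np.log10(spectrum + 1e-12)


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # 0 * log(0) is treated as 0
    return float(-(p * np.log2(p)).sum())


def select_by_entropy(samples, entropies, keep_fraction=0.5):
    """Keep the samples with the highest entropy scores.

    Assumption: higher entropy marks a more useful sample. The paper's
    actual selection rule may differ.
    """
    k = max(1, int(round(keep_fraction * len(samples))))
    top = np.argsort(entropies)[::-1][:k]
    # Preserve the original sample order among the kept items.
    return [samples[i] for i in sorted(top)]
```

For example, a complex tone centred on bin 8 of a 64-point frame yields a waterfall whose strongest column is bin 8 in every row, and `select_by_entropy` with `keep_fraction=0.5` returns the half of the samples with the largest scores.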
Key words
anti-jamming/deep reinforcement learning/sample information entropy/intelligent jamming
Funding
National Natural Science Foundation of China (U23A20279)
Zhongdian Tian'ao (中电天奥) Innovation Theory and Technology Group Fund (2022-1193-04-04)
Year of publication
2024