Journal on Communications, 2024, Vol. 45, Issue 7: 117-126. DOI: 10.11959/j.issn.1000-436x.2024131

Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation

Zhou Quan 1, Niu Yingtao 2
Author Information

  • 1. The Sixty-third Research Institute, National University of Defense Technology, Nanjing, Jiangsu 210007, China; College of Communications Engineering, Army Engineering University of PLA, Nanjing, Jiangsu 210007, China
  • 2. The Sixty-third Research Institute, National University of Defense Technology, Nanjing, Jiangsu 210007, China

Abstract

To improve the learning efficiency of anti-jamming algorithms based on deep reinforcement learning and enable them to adapt more quickly to unknown jamming environments, a fast deep reinforcement learning anti-jamming algorithm based on similar sample generation was proposed. By combining the similarity measurement of state-action pairs, derived from bisimulation, with an anti-jamming algorithm grounded in the deep Q-network, this algorithm was able to quickly learn effective multi-domain anti-jamming strategies in unknown, dynamic jamming environments. Specifically, once a transmission action was completed, the proposed algorithm first interacted with the environment using the deep Q-network to acquire actual state-action pairs. Then it generated a set of similar state-action pairs based on bisimulation, employing these similar state-action pairs to produce simulated training samples. Through these operations, the algorithm was able to acquire a large number of training samples at each iteration step, thereby significantly accelerating the training process and convergence speed. Simulation results show that under comb sweep jamming and intelligent blocking jamming, the proposed algorithm exhibits rapid convergence, and its normalized throughput after convergence is significantly superior to that of the conventional deep Q-network algorithm, the Q-learning algorithm, and the improved Q-learning algorithm based on knowledge reuse.
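The full paper is not reproduced on this page, but the sample-augmentation mechanism the abstract describes can be sketched in Python. All names below (`similar_pairs`, `augment_replay`, `reward_model`) are hypothetical, and the channel-distance similarity rule is a crude illustrative stand-in for the paper's bisimulation-based metric, not the authors' actual implementation:

```python
def similar_pairs(state, action, n_channels, radius=1):
    """Illustrative similarity set: state-action pairs whose channel indices
    lie within `radius` of the real pair. The paper instead derives this set
    from a bisimulation-based distance between state-action pairs."""
    pairs = []
    for ds in (-radius, 0, radius):
        for da in (-radius, 0, radius):
            s2 = (state + ds) % n_channels
            a2 = (action + da) % n_channels
            if (s2, a2) != (state, action):
                pairs.append((s2, a2))
    return pairs


def augment_replay(buffer, transition, reward_model, n_channels):
    """Store the real transition, then add simulated transitions built from
    similar state-action pairs, so each environment step yields many training
    samples instead of one. Reusing the real next_state for the simulated
    pairs is a simplification of this sketch."""
    state, action, reward, next_state = transition
    buffer.append(transition)
    for s2, a2 in similar_pairs(state, action, n_channels):
        # Simulated reward comes from an assumed environment model, e.g.
        # success when the transmit channel differs from the jammed channel.
        buffer.append((s2, a2, reward_model(s2, a2), next_state))
    return buffer
```

With one real transition per transmission slot and `radius=1`, each DQN update can draw on nine samples (one real, eight simulated) instead of one, which is the mechanism the abstract credits for the faster training and convergence.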

Keywords

communication anti-jamming; deep reinforcement learning; fast anti-jamming; reliable communication

Funding

Supported by the National Natural Science Foundation of China (No. 62371461)

Publication Year

2024
Journal on Communications
China Institute of Communications

Indexed in: CSTPCD, CSCD, PKU Core Journals
Impact factor: 1.265
ISSN: 1000-436X