Intelligent Anti-jamming Decision Algorithm of Frequency Hopping System Based on DE-SARSA(TS)

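Abstract (translated from Chinese): To improve the anti-jamming performance of frequency-hopping communication systems in complex electromagnetic environments, an intelligent anti-jamming decision algorithm is proposed that combines Thompson sampling, the Dyna model, and Expected SARSA learning. The Dyna model is introduced into Expected SARSA so that model learning is combined with reinforcement learning, which improves the algorithm's convergence speed and steady-state performance. Thompson sampling and the Tanh function are used to improve the action-selection mechanism, which improves the algorithm's exploration and exploitation of the environment. The state-action space is constructed with the jamming environment of each time slot as the state and with the hopping rate, instantaneous signal bandwidth, frequency sequence, and other parameters as the actions, and a corresponding frequency-hopping system model and reward function are designed. Simulation results in a complex jamming environment in which Gaussian white noise, narrowband jamming, wideband jamming, and swept-frequency jamming coexist show that the algorithm balances exploration and exploitation of the environment and achieves faster convergence and stronger anti-jamming capability than the comparison algorithms.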

Title as published in English: Intelligent Anti-jamming Decision Algorithm of Frequency Hopping System Based on DE-SARSA(TS)

Abstract as published in English: To improve the anti-jamming performance of frequency hopping communication systems in complex electromagnetic environments, an intelligent anti-jamming decision-making algorithm based on Thompson sampling, the Dyna model, and expected SARSA learning is proposed. In the expected SARSA learning, the Dyna model is applied, and the convergence speed and steady-state performance are improved because reinforcement learning is combined with model learning. The action selection strategy is further improved by using the Thompson sampling algorithm and the Tanh function, which enhances the method's exploration and exploitation of the environment. The interference environment corresponding to the time slot is set as the state, and the frequency hopping rate, signal instantaneous bandwidth, frequency sequence, and source power are set as actions for constructing the state-action space; the corresponding frequency hopping system model and reward function are then designed. In a complex interference environment where Gaussian white noise, narrowband interference, broadband interference, and frequency sweep interference coexist, the simulation results show that this algorithm balances exploration and exploitation of the environment and achieves faster convergence and stronger anti-interference ability than the compared algorithms.
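The abstract's core idea, adding Dyna-style planning from a learned model to Expected SARSA, can be illustrated with a minimal tabular sketch. This is not the authors' DE-SARSA(TS) implementation: the abstract does not say how Thompson sampling and the Tanh function enter the action selection, so a plain epsilon-greedy policy is swapped in here. The toy jammer `toy_env_step`, the reward values, the function names, and all hyperparameters are assumptions chosen only for illustration.

```python
import random

def policy_probs(q_row, epsilon):
    """Epsilon-greedy action probabilities for one state (stand-in policy)."""
    n = len(q_row)
    best = max(range(n), key=lambda a: q_row[a])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs

def expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon):
    """Move Q[s][a] toward r + gamma * E_pi[Q(s_next, .)]."""
    probs = policy_probs(Q[s_next], epsilon)
    expected = sum(p * q for p, q in zip(probs, Q[s_next]))
    Q[s][a] += alpha * (r + gamma * expected - Q[s][a])

def dyna_expected_sarsa(env_step, n_states, n_actions, steps=2000,
                        planning_steps=10, alpha=0.1, gamma=0.9,
                        epsilon=0.1, seed=0):
    """Tabular Expected SARSA with Dyna planning on a deterministic model."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # learned model: (state, action) -> (reward, next_state)
    s = 0
    for _ in range(steps):
        probs = policy_probs(Q[s], epsilon)
        a = rng.choices(range(n_actions), weights=probs)[0]
        r, s_next = env_step(s, a)
        # Direct reinforcement learning from the real transition.
        expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon)
        # Model learning: remember what this state-action pair produced.
        model[(s, a)] = (r, s_next)
        # Planning: replay transitions drawn from the learned model.
        for _ in range(planning_steps):
            (ps, pa), (pr, pn) = rng.choice(list(model.items()))
            expected_sarsa_update(Q, ps, pa, pr, pn, alpha, gamma, epsilon)
        s = s_next
    return Q

def toy_env_step(state, action):
    """Hypothetical jammer: in time slot `state`, channel `state` is jammed."""
    reward = -1.0 if action == state else 1.0
    return reward, (state + 1) % 3
```

Calling `dyna_expected_sarsa(toy_env_step, n_states=3, n_actions=3)` returns a Q-table whose greedy action in each slot should avoid the jammed channel. The planning loop is what distinguishes the Dyna variant: each real transition is followed by several simulated updates, which is the mechanism the abstract credits for faster convergence.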

English keywords:

complex electromagnetic environment; frequency hopping system; expected SARSA learning; Thompson sampling; Dyna model

Authors:

YUAN Ze (袁泽), ZHAO Zhijin (赵知劲)

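Author affiliations:

School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China

National Key Laboratory of Information Control Technology for Communication Systems (通信系统信息控制技术国家级重点实验室), the 36th Research Institute of China Electronics Technology Group Corporation, Jiaxing, Zhejiang 314001, China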

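Keywords (translated from Chinese):

complex electromagnetic environment; frequency hopping system; Expected SARSA learning; Thompson sampling; Dyna model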

Funding:

National Natural Science Foundation of China

Project number:

U19B2016

Publication year:

2024

DOI:

10.13954/j.cnki.hdu.2024.01.002

Journal of Hangzhou Dianzi University

Hangzhou Dianzi University

Impact factor: 0.277

ISSN: 1001-9146

Year, volume (issue): 2024, 44(1)

Number of references: 16