Intelligent Anti-jamming Decision Algorithm for Frequency Hopping Systems Based on DE-SARSA(TS)
Yuan Ze (袁泽) 1, Zhao Zhijin (赵知劲) 2
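Author information
- 1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
- 2. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; National Key Laboratory of Information Control Technology for Communication Systems, No. 36 Research Institute of China Electronics Technology Group Corporation, Jiaxing, Zhejiang 314001, China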
Abstract
To improve the anti-jamming performance of frequency hopping communication systems in complex electromagnetic environments, an intelligent anti-jamming decision algorithm is proposed that combines Thompson sampling, the Dyna model and expected SARSA learning. The Dyna model is introduced into expected SARSA learning so that model learning is combined with reinforcement learning, which improves the convergence speed and steady-state performance of the algorithm. The action selection mechanism is improved with Thompson sampling and the Tanh function, which enhances the algorithm's exploration and exploitation of the environment. The jamming environment corresponding to each time slot is taken as the state, and the frequency hopping rate, signal instantaneous bandwidth, frequency sequence and source power are taken as actions to construct the state-action space; the corresponding frequency hopping system model and reward function are then designed. Simulation results in a complex jamming environment where Gaussian white noise, narrowband jamming, wideband jamming and sweep-frequency jamming coexist show that the algorithm balances exploration and exploitation of the environment and achieves faster convergence and stronger anti-jamming capability than the compared algorithms.
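As a rough illustration of how a Dyna model can be combined with expected SARSA learning, the following minimal tabular sketch performs one real update and then replays remembered transitions as planning steps. It is not the authors' implementation: the abstract does not specify the Thompson sampling and Tanh action selection, so an epsilon-greedy policy is substituted here for computing the expectation, and the function names, the deterministic model and all parameter values are assumptions made only for this example.

```python
import random
from collections import defaultdict


def expected_q(q, state, n_actions, epsilon):
    """Expected Q-value of `state` under an epsilon-greedy policy (assumed policy)."""
    values = [q[(state, a)] for a in range(n_actions)]
    best = max(values)
    n_best = values.count(best)
    expectation = 0.0
    for v in values:
        prob = epsilon / n_actions            # exploration share
        if v == best:
            prob += (1.0 - epsilon) / n_best  # greedy share, ties split evenly
        expectation += prob * v
    return expectation


def dyna_expected_sarsa_step(q, model, s, a, r, s_next, n_actions,
                             alpha=0.1, gamma=0.9, epsilon=0.1,
                             planning_steps=5, rng=random):
    """One real expected-SARSA update followed by simulated (planning) updates."""
    def update(s0, a0, r0, s1):
        target = r0 + gamma * expected_q(q, s1, n_actions, epsilon)
        q[(s0, a0)] += alpha * (target - q[(s0, a0)])

    update(s, a, r, s_next)           # direct reinforcement learning
    model[(s, a)] = (r, s_next)       # model learning (deterministic model assumed)
    for _ in range(planning_steps):   # planning from remembered transitions
        (ps, pa), (pr, pn) = rng.choice(list(model.items()))
        update(ps, pa, pr, pn)


# Hypothetical usage: two actions, a single state, reward 1.0 for action 1.
q_table = defaultdict(float)
world_model = {}
dyna_expected_sarsa_step(q_table, world_model, s=0, a=1, r=1.0, s_next=0,
                         n_actions=2)
```

In this sketch the planning loop reuses stored experience without new interaction with the environment, which is the usual reason Dyna-style methods are expected to converge in fewer real time slots.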
Key words
complex electromagnetic environment/frequency hopping system/expected SARSA learning/Thompson sampling/Dyna model
Funding
National Natural Science Foundation of China (U19B2016)
Publication year
2024