Stealthy data poisoning attack method on offline reinforcement learning in unmanned systems
To address the limited effectiveness and stealth of existing data poisoning attacks on offline reinforcement learning (RL), a critical time-step dynamic poisoning attack was proposed, perturbing important samples to achieve efficient and covert attacks. Temporal difference errors, identified through theoretical analysis as crucial for model learning, were used to guide the selection of poisoning targets. A bi-objective optimization approach was introduced to minimize perturbation magnitude while maximizing the negative impact on performance. Experimental results show that with a poisoning rate of only 1%, the method reduces agent performance by 84%, revealing the sensitivity and vulnerability of offline RL models in unmanned systems.
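
As an illustration of how temporal difference errors could guide the choice of critical time steps, the following minimal sketch computes one-step TD errors for a batch of transitions and ranks them by magnitude. It is not the paper's implementation: the function names `td_errors` and `select_critical_steps`, the discount factor `gamma`, the use of precomputed value estimates, and the rule of ranking by absolute TD error are all assumptions made for illustration. The abstract does not specify the exact selection criterion or the perturbation step, and neither is reproduced here.

```python
from typing import List, Sequence


def td_errors(
    rewards: Sequence[float],
    values: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """One-step TD error per transition: r + gamma * V(s') * (1 - done) - V(s)."""
    return [
        r + gamma * nv * (0.0 if d else 1.0) - v
        for r, v, nv, d in zip(rewards, values, next_values, dones)
    ]


def select_critical_steps(errors: Sequence[float], rate: float = 0.01) -> List[int]:
    """Return sorted indices of the top `rate` fraction of samples by |TD error|.

    Ranking by absolute TD error is an assumed selection rule.
    At least one index is always returned.
    """
    k = max(1, int(len(errors) * rate))
    ranked = sorted(range(len(errors)), key=lambda i: abs(errors[i]), reverse=True)
    return sorted(ranked[:k])
```

With a dataset of 100 transitions and `rate=0.01`, `select_critical_steps` returns a single index, which corresponds to the 1% poisoning rate reported in the experiments.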