Multi-agent cooperative electronic countermeasure method based on reinforcement learning
Traditional electronic warfare is gradually evolving into intelligent electronic warfare that integrates artificial intelligence technology. To address the difficulty that multi-agent reinforcement learning algorithms have in converging in complex, high-dimensional state-action spaces, a multi-agent dueling double policy gradient algorithm based on prioritized experience replay (PerMaD4) is proposed. The algorithm introduces a prioritized experience replay mechanism and adds a dueling Critic network and a double Critic network, which balance the relationship between action and value and reduce the uncertainty of relying on a single Critic network. Simulation results show that, compared with other reinforcement learning algorithms, PerMaD4 converges better and raises the task completion degree by 8.9% in the same simulation scenario.
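As an illustration of the experience replay mechanism named above, the following is a minimal sketch of a proportional prioritized replay buffer. It is not the paper's implementation: the class name `PrioritizedReplayBuffer`, the parameters `alpha`, `beta`, and `eps`, and the choice of the absolute TD error as the priority are all assumptions made for this example.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical minimal version)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps        # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions take the current maximum priority so that they
        # are likely to be replayed before their TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform
        # sampling; they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Assumption: priority is the absolute TD error reported by the Critic.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a multi-agent setting, each stored transition would typically hold the joint observations and actions of all agents, and the priorities would be refreshed with the Critic's TD errors after every update step.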