Cooperative decision-making for heterogeneous UAV swarm confrontation based on self-play reinforcement learning
With the development of unmanned aerial vehicle (UAV) technology, UAV swarm confrontation has become a research hotspot worldwide. Existing decision-making algorithms mainly target homogeneous UAV swarm confrontation; when facing complex adversarial environments, these methods encounter challenges such as the difficulty of designing reward functions and the inability to meet real-time decision-making requirements. To this end, this paper focuses on the real-time maneuver decision-making problem in heterogeneous UAV swarm combat. First, we construct an adversarial simulation environment for a leader-follower heterogeneous UAV swarm, in which the leader and follower UAVs possess different maneuvering and attacking capabilities and their individual outcomes affect victory differently. Second, we propose a distributed cooperative maneuver control algorithm for UAV swarms based on multi-agent reinforcement learning, and design a training and optimization approach that combines curriculum learning with self-play. By pairing simple sparse rewards with curriculum learning, we obtain cooperative maneuver strategies for the heterogeneous UAV swarm. Introducing a self-play adversarial mode makes the opponent UAV strategies more targeted, increasing the intensity of combat and further optimizing the maneuver strategies to better match practical requirements. Finally, the effectiveness and scalability of the proposed methods are validated through simulations.
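The training scheme described above, curriculum stages combined with a self-play opponent pool under sparse rewards, can be sketched in skeletal form. This is a minimal illustration, not the paper's implementation: the integer "policy", the outcome model, and all function names are placeholder assumptions standing in for learned policy parameters and the actual combat simulation.

```python
import random

def train_curriculum_selfplay(num_stages=3, episodes_per_stage=4, seed=0):
    """Toy sketch of curriculum learning + self-play training.

    Assumptions (not from the paper): an integer stands in for policy
    parameters, a random draw stands in for a full swarm engagement,
    and the sparse reward is +1 for a win and 0 otherwise.
    """
    rng = random.Random(seed)
    current_policy = 0               # placeholder for learnable parameters
    opponent_pool = [current_policy] # frozen past policies for self-play
    history = []

    for stage in range(num_stages):  # curriculum: later stages are harder
        for _ in range(episodes_per_stage):
            opponent = rng.choice(opponent_pool)  # sample a frozen opponent
            # Sparse reward: +1 only on victory (placeholder outcome model
            # in which a stronger policy wins more often).
            margin = (current_policy - opponent) / (1 + stage)
            win = rng.random() < 0.5 + 0.1 * margin
            reward = 1.0 if win else 0.0
            current_policy += reward  # stand-in for a gradient update
            history.append((stage, reward))
        # Freeze a snapshot so future training faces stronger opponents.
        opponent_pool.append(current_policy)

    return current_policy, opponent_pool, history
```

The key design point the sketch mirrors is that opponents are drawn from a growing pool of frozen snapshots, so the adversary's strength tracks the learner's and the sparse win/loss signal stays informative as the curriculum advances.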