Research on PPO Algorithm Design for UAV Swarm Reconnaissance and Strike Scenarios
Decision-making for UAV swarms is an important research direction in intelligent warfare. Taking a constructed typical UAV swarm reconnaissance and strike mission as an example, this paper studies task allocation and motion planning for UAV swarms under complex and uncertain conditions. The complexity of mission decision-making and the uncertainty of the battlefield environment are first described through the parametric design of the battlefield environment model and of the typical swarm reconnaissance and strike mission. A highly versatile state space, reward function, action space, and policy network are then designed. Several types of features are designed and processed to form the state space so that it captures multiple kinds of situational information, and multiple types of rewards closely tied to the reconnaissance and strike task are designed alongside them. The action output takes a subject-verb-object form to better express complex operations. The policy network adopts an encoder, time-series aggregation, attention mechanism, and decoder structure, which fully integrates the feature information and improves training. The problem is then solved by deep reinforcement learning (DRL) based on proximal policy optimization (PPO). Finally, simulation experiments verify the feasibility and effectiveness of the UAV swarm's reconnaissance and strike mission decision-making under complex and uncertain conditions, and demonstrate the intelligence of the swarm's task allocation and motion planning.
proximal policy optimization design; task assignment; motion planning; reconnaissance and strike; collaborative decision-making
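For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not the authors' implementation: the function name `ppo_clip_objective`, its parameters, and the clipping value 0.2 are illustrative assumptions, and the abstract does not state which hyperparameters or loss terms the paper uses.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is a common default, assumed here.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data when doing so would increase the objective, which is the main reason PPO tends to train stably.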