To address the mission planning problem for multiple unmanned boats (unmanned surface vehicles, USVs), a deep reinforcement learning mission planning method based on proximal policy optimization (PPO) is proposed. Taking the task of a USV swarm striking targets in an enemy port as the research object, the task decision-making problem is abstracted as a reasonable and effective Markov decision process, and an intelligent planning model based on the PPO algorithm is established. Policy training techniques such as advantage normalization, reward scaling, and a policy entropy term are introduced to improve the learning performance and generalization ability of the PPO model. Finally, simulation results show that the PPO-based USV swarm can cooperate effectively against enemy targets, which demonstrates the effectiveness of the proposed PPO model for mission decision-making.
deep reinforcement learning; Markov decision-making processes; unmanned boat swarms; mission planning
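As an illustration of the three training techniques named in the abstract, the following minimal sketch shows how advantage normalization, reward scaling, and a policy entropy bonus commonly enter a clipped PPO objective. It is not the model proposed in this paper: the function names, the batch-level form of reward scaling, and the coefficients `clip_eps=0.2` and `ent_coef=0.01` are all assumptions chosen for illustration.

```python
import math

def _mean_std(values):
    """Return the population mean and standard deviation of a list."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std

def normalize_advantages(advantages, eps=1e-8):
    """Advantage normalization: zero mean, unit standard deviation per batch."""
    mean, std = _mean_std(advantages)
    return [(a - mean) / (std + eps) for a in advantages]

def scale_rewards(rewards, eps=1e-8):
    """Reward scaling (assumed batch variant): divide by the batch std only.

    Many implementations instead use a running std of discounted returns.
    """
    _, std = _mean_std(rewards)
    return [r / (std + eps) for r in rewards]

def entropy(probs):
    """Shannon entropy of one discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def ppo_loss(new_logps, old_logps, advantages, action_probs,
             clip_eps=0.2, ent_coef=0.01):
    """Clipped PPO surrogate loss with an entropy bonus (to be minimized)."""
    adv = normalize_advantages(advantages)
    surrogate = 0.0
    for new_lp, old_lp, a in zip(new_logps, old_logps, adv):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        surrogate += min(ratio * a, clipped * a)
    surrogate /= len(adv)
    mean_entropy = sum(entropy(p) for p in action_probs) / len(action_probs)
    # Negate because optimizers minimize; the entropy term rewards exploration.
    return -(surrogate + ent_coef * mean_entropy)
```

In this sketch the entropy coefficient controls how strongly the policy is encouraged to keep exploring, while normalization and scaling keep the gradient magnitudes comparable across batches.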