To address the mission planning problem for multiple unmanned surface vehicles (USVs), a deep reinforcement learning mission planning method based on proximal policy optimization (PPO) is proposed. Taking a USV swarm's strike on targets in an enemy port as the research object, the task decision-making problem is abstracted into a Markov decision process, and a PPO-based intelligent planning model is established. Policy training techniques such as advantage normalization, reward scaling, and policy entropy are introduced to improve the learning performance and generalization ability of the PPO model. Simulation results show that the PPO-based USV swarm can effectively cooperate against enemy targets, which demonstrates the effectiveness of the proposed PPO model for mission decision-making.
Keywords
deep reinforcement learning/Markov decision process/unmanned surface vehicle swarm/mission planning
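
The abstract names three training techniques (advantage normalization, reward scaling, and policy entropy) without giving their form. The following is a minimal Python sketch of one common way each is combined with the PPO clipped objective. It is not the authors' implementation: the function and class names, the running-standard-deviation variant of reward scaling, and the default values of `clip_eps`, `entropy_coef`, and `gamma` are all assumptions made for illustration.

```python
import math


def normalize_advantages(advantages, eps=1e-8):
    """Advantage normalization: zero mean, unit std over one batch."""
    n = len(advantages)
    mean = sum(advantages) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in advantages) / n)
    return [(a - mean) / (std + eps) for a in advantages]


class RewardScaler:
    """Reward scaling (assumed variant): divide each reward by the running
    standard deviation of the discounted return seen so far."""

    def __init__(self, gamma=0.99, eps=1e-8):
        self.gamma, self.eps = gamma, eps
        self.ret = 0.0    # running discounted return
        self.count = 0    # number of returns observed
        self.mean = 0.0   # running mean of returns (Welford)
        self.m2 = 0.0     # running sum of squared deviations (Welford)

    def __call__(self, reward):
        self.ret = self.gamma * self.ret + reward
        self.count += 1
        delta = self.ret - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (self.ret - self.mean)
        std = math.sqrt(self.m2 / self.count)
        if self.count < 2 or std == 0.0:
            return reward  # not enough statistics yet, pass through
        return reward / (std + self.eps)


def entropy(probs):
    """Shannon entropy of one discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ppo_loss(new_logp, old_logp, advantages, action_probs,
             clip_eps=0.2, entropy_coef=0.01):
    """Clipped PPO surrogate with an entropy bonus, as a loss to minimize."""
    adv = normalize_advantages(advantages)
    surrogate = 0.0
    for lp_new, lp_old, a in zip(new_logp, old_logp, adv):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        surrogate += min(ratio * a, clipped * a)
    surrogate /= len(adv)
    mean_entropy = sum(entropy(p) for p in action_probs) / len(action_probs)
    return -(surrogate + entropy_coef * mean_entropy)
```

In this sketch the entropy bonus rewards more exploratory policies, while normalization and scaling keep the gradient magnitude comparable across batches and reward ranges; a real training loop would compute these quantities on tensors inside an automatic-differentiation framework.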