
Reward shaping based reinforcement learning for intelligent missile penetration attack strategy planning

To meet the requirements of future distributed maritime operations, this paper takes as its background a salvo of intelligent missiles penetrating the defenses of surface ships in a distributed warfare scenario under adversarial conditions. First, the missile penetration strategy planning problem is analyzed. Second, a reward-shaping-based reinforcement learning method for intelligent missile penetration strategy planning is designed using multiple classes of reward functions. Then, an operational scenario of missiles penetrating ship defenses is constructed on the Mozi joint operation simulation system; comparison experiments show that intelligent missiles controlled by the model learned with reward shaping achieve a penetration strike success rate of 79%, which verifies the effectiveness of the reward-shaping-based reinforcement learning method. Finally, after-action review shows that four types of anti-ship penetration strategies emerged in the reward shaping experiments: concentrated roundabout attack, dispersed multi-directional penetration attack, grouped time-delayed attack, and cruise-detection-guided attack.

penetrating anti-ship; intelligent missile; salvo penetration; task planning; wargaming

Luo Junren, Liu Guo, Su Jiongming, Zhang Wanpeng, Chen Jing


College of Intelligence Science, National University of Defense Technology, Changsha, Hunan 410073


National Natural Science Foundation of China (61806212); Hunan Provincial Postgraduate Innovation Project (CX20210011)

2024

Chinese Journal of Intelligent Science and Technology

CSTPCD
Year, volume (issue): 2024, 6(2)