
Optimization Selection Method of Fire Plan Based on D3QN

To address the low efficiency of fire plan optimization when multiple types of munitions are used in a coordinated attack on ground fortification targets, a fire plan optimization method based on the dueling double deep Q network (D3QN) is proposed. The method models the strike process as a Markov decision process (MDP), designs its state space and action space, and designs a composite reward function to drive optimization of the fire plan generation policy, so that the agent trains the policy autonomously within a reinforcement learning framework. Simulation results show that, for ground fortification targets, the method obtains better fire plans than traditional heuristic intelligent algorithms, and that its computational efficiency and result stability show a clearer advantage over traditional deep reinforcement learning algorithms.
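The abstract names D3QN but does not give the paper's state, action, or reward design, so none of that is reproduced here. As a minimal sketch of the two generic ingredients the name refers to, the snippet below shows the dueling aggregation of a state value and per-action advantages into Q-values, and the double-DQN target in which the online network selects the next action and the target network evaluates it. The function names and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Double-DQN target: the online net picks the action, the target net scores it."""
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Toy next-state outputs (hypothetical values, three candidate actions).
q_online_next = dueling_q_values(1.0, [0.5, -0.5, 0.0])
q_target_next = dueling_q_values(2.0, [0.0, 1.0, -1.0])
y = double_dqn_target(1.0, 0.5, q_online_next, q_target_next, done=False)
```

In this toy case the online network prefers action 0, so the target uses the target network's value for action 0 rather than the target network's own maximum, which is the decoupling that double DQN relies on to reduce overestimation.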

deep reinforcement learning; deep Q network; D3QN; combinatorial optimization problem; optimization of fire plan

SHE Wei, YUE Han, TIAN Zhao, KONG Defeng


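School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450000

Songshan Laboratory, Zhengzhou 450000

Zhengzhou Key Laboratory of Blockchain and Data Intelligence, Zhengzhou 450000

Institute of Engineering Protection, Institute of Defense Engineering, Academy of Military Sciences, Luoyang, Henan 471023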



Songshan Laboratory Pre-Research Project (YYYY022022003); Henan Province Key R&D and Promotion Special Fund Project (212102310039)

2024

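Fire Control & Command Control (火力与指挥控制)
Fire Control & Command Control Research Society; Fire Control & Command Control Professional Information Network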


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.312
ISSN: 1002-0640
Year, volume (issue): 2024, 49(8)