
Multi-agent Game Strategy Generation Method Based on Hierarchical Reinforcement Learning

Typical multi-agent confrontation strategy generation methods based on deep reinforcement learning adopt a "decentralized" framework in which each agent generates its strategy and makes decisions from partially observable information. These methods cannot generate a confrontation strategy from a global perspective, which greatly limits their decision-making ability. To address this problem, an improved multi-agent game strategy generation method based on hierarchical reinforcement learning is proposed. First, a decision mapping from observation information to overall value is constructed based on hierarchical reinforcement learning, an optimization problem is formulated with maximization of the overall value as the objective, and the strategy optimization process is derived, providing a theoretical basis for the subsequent design of the framework structure and method implementation. Then, based on the decision mapping and the optimization problem, a model framework is designed using neural networks, and the top-level strategy control model and the individual strategy execution model are described in detail. Next, the detailed training procedure and algorithm flow are given based on the strategy optimization method. Finally, the method is compared with typical multi-agent methods in the StarCraft Multi-Agent Challenge (SMAC) environment. Experimental results show that the method effectively generates confrontation strategies, controls heterogeneous agents to defeat preset opponent strategies, and clearly outperforms typical multi-agent reinforcement learning methods.

hierarchical reinforcement learning; multi-agent game; deep neural network

Chang Xin, Li Yanbin, Liu Donghui


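The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang, Hebei 050081

School of Management, Shijiazhuang Tiedao University, Shijiazhuang, Hebei 050043

Research Center for Engineering Construction Management, Shijiazhuang Tiedao University, Shijiazhuang, Hebei 050043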


China Postdoctoral Science Foundation (2021M693002); National Natural Science Foundation of China (71991485, 71991481, 71991480)

2024

Radio Engineering
The 54th Research Institute of China Electronics Technology Group Corporation


Impact factor: 0.667
ISSN:1003-3106
Year, Volume (Issue): 2024, 54(6)