Computer Simulation (计算机仿真), 2024, Vol. 41, Issue 7: 484-490.

A 2v2 Sanguosha (三国杀) Multi-Agent Game Method Based on Reinforcement Learning

Luo Furong (骆芙蓉), Wang Yisong (王以松), Qin Jin (秦进), Yu Xiaomin (于小民)

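Author information

  • 1. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025; Institute of Artificial Intelligence, Guizhou University, Guiyang, Guizhou 550025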

Abstract

Deep reinforcement learning has achieved great success in sequential decision-making and policy exploration. Much of this research draws its inspiration from games, and its applications have expanded from single-agent to multi-agent scenarios. Card-based multiplayer strategy games are multi-agent systems, but existing studies are few, and most of them concern Dou Dizhu and Texas Hold'em. To broaden research on card-based multi-agent strategy games, this paper proposes a reinforcement-learning-based multi-agent game method for Sanguosha (SGS-MAPG). A self-built 2v2 battle scenario set in the Sanguosha game serves as the experimental environment. The cooperating agents are modeled using the idea of policy gradients, their decision process covers both teamwork and confrontation within the multi-agent system, and the method addresses the instability that arises in multi-agent environments. In computer-simulated matches, agents trained with this method show good learning and decision-making ability, can obtain a higher final team reward than the baseline algorithm, and achieve a win rate at least 12% higher.
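
As an illustration of the policy-gradient idea mentioned in the abstract, the following minimal sketch trains two cooperating teammates with a plain REINFORCE update driven by a shared team reward. It is not the SGS-MAPG method, whose details are not given here. The toy reward rule, the stateless two-action policies, and all names (`team_reward`, `train`, `lr`) are assumptions made for this example only.

```python
import math
import random


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an action index according to probs."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def team_reward(a1, a2):
    # Hypothetical toy rule (an assumption, not from the paper):
    # the team scores only when both teammates pick action 1.
    return 1.0 if a1 == 1 and a2 == 1 else 0.0


def train(episodes=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    # One stateless softmax policy (two logits) per teammate.
    logits = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        probs = [softmax(l) for l in logits]
        actions = [sample(p, rng) for p in probs]
        r = team_reward(*actions)  # shared by both teammates
        for i in range(2):
            for a in range(2):
                # Gradient of log pi_i(actions[i]) with respect to logit a.
                grad = (1.0 if a == actions[i] else 0.0) - probs[i][a]
                logits[i][a] += lr * r * grad  # REINFORCE step, no baseline
    return [softmax(l) for l in logits]


policies = train()
print(policies)
```

Because both teammates are updated with the same team reward, each learns to favor the action that pays off only when its partner cooperates. A practical method would also condition on the game state and usually subtract a baseline to reduce variance, both of which are omitted here for brevity.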

Keywords

Deep reinforcement learning/Multi-agent/Sanguosha game environment/Cooperation and confrontation

Funding

National Natural Science Foundation of China (U1836205)

Publication year

2024
Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Number of references: 3