A Fast Discard Method for Public Mahjong Computer Games

Mahjong is a typical imperfect-information game. Most current approaches to mahjong rely on deep reinforcement learning and have achieved very good results. However, such mahjong AIs depend on high-quality datasets, and public mahjong (大众麻将) lacks the large, effectively labeled datasets they require, so deciding quickly which tile to discard during play becomes the main problem. To address this, the paper studies the discard action. Taking heuristic fast discarding as its starting point, it proposes a Monte Carlo evaluation method that targets the tiles opponents need to win, and combines the two methods to assign a value to each tile in the hand; the valuation scores determine which tile is discarded each turn. A cutoff, set empirically as a number of discards made so far, divides the game into an early and a late decision period: the heuristic fast discard method is used in the early period and the Monte Carlo evaluation method in the late period. This layered, two-period decision process selects the best discard and effectively reduces both the decision time and the deal-in rate (点炮率, the rate of discarding a tile that lets an opponent win). A program using the method won a first prize at the Chinese Computer Game Tournament, which demonstrates the method's effectiveness.
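The two-period decision described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the integer tile encoding, the toy `heuristic_value` function, the `completes_meld` test used as a simplified stand-in for "the opponent wins on this tile", and the `cutoff` and `danger_weight` parameters. It shows only the overall shape: a cheap heuristic before the cutoff, and a heuristic value combined with a sampled danger estimate after it.

```python
import random
from collections import Counter

# Assumed tile encoding (not from the paper): 0-8, 9-17 and 18-26 are the
# three suits (ranks 1-9); 27-33 are honor tiles.
HONOR_START = 27


def heuristic_value(tile, hand):
    """Toy keep-value: copies and nearby same-suit tiles make a tile worth keeping."""
    counts = Counter(hand)
    value = 2.0 * (counts[tile] - 1)  # pairs and triplets
    if tile < HONOR_START:
        rank = tile % 9
        for step in (-2, -1, 1, 2):
            if 0 <= rank + step <= 8 and counts[tile + step] > 0:
                value += 1.0 if abs(step) == 1 else 0.5
    return value


def completes_meld(tile, counts):
    """True if `tile` would finish a triplet or a same-suit sequence in `counts`.

    Simplified proxy for "an opponent needs this tile"; a real program would
    test for a complete winning hand instead.
    """
    if counts[tile] >= 2:
        return True
    if tile >= HONOR_START:
        return False
    rank = tile % 9
    base = tile - rank
    for low in (rank - 2, rank - 1, rank):
        if low < 0 or low + 2 > 8:
            continue
        others = [base + r for r in range(low, low + 3) if r != rank]
        if all(counts[t] > 0 for t in others):
            return True
    return False


def monte_carlo_danger(tile, unseen, rng, n_samples=200, opp_hand_size=13):
    """Fraction of sampled opponent hands, drawn from unseen tiles, that want `tile`."""
    pool = list(unseen)
    k = min(opp_hand_size, len(pool))
    hits = 0
    for _ in range(n_samples):
        sampled_hand = Counter(rng.sample(pool, k))
        hits += completes_meld(tile, sampled_hand)
    return hits / n_samples


def choose_discard(hand, unseen, discards_so_far, cutoff=8, danger_weight=3.0, rng=None):
    """Early period: heuristic only. Late period: heuristic plus sampled danger."""
    rng = rng or random.Random(0)
    candidates = sorted(set(hand))
    if discards_so_far < cutoff:
        return min(candidates, key=lambda t: heuristic_value(t, hand))
    return min(
        candidates,
        key=lambda t: heuristic_value(t, hand)
        + danger_weight * monte_carlo_danger(t, unseen, rng),
    )


# Usage: a 14-tile hand and the tiles this player cannot see.
wall = [t for t in range(34) for _ in range(4)]
hand = [0, 1, 2, 4, 5, 6, 9, 9, 9, 18, 19, 20, 30, 33]
unseen = list(wall)
for t in hand:
    unseen.remove(t)

early = choose_discard(hand, unseen, discards_so_far=2)   # heuristic period
late = choose_discard(hand, unseen, discards_so_far=12)   # Monte Carlo period
```

Skipping the sampling step before the cutoff is what keeps early decisions fast in this sketch; the sampled danger term is added only in the late period, when opponents are more likely to be close to winning and the deal-in risk matters more.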

computer game; imperfect information game; mahjong game; heuristic fast discard; Monte Carlo evaluation method

张小川、严明珠、涂飞、陈俊宇、魏乐天

Liangjiang College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401120

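National Natural Science Foundation of China (60443004); Chongqing Technology Innovation and Application Development Special Project (cstc2021jscxdxwtBX0019)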

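Journal of Chongqing University of Technology (重庆理工大学学报)
Chongqing University of Technology

Indexed in CSTPCD and the Peking University Core Journals list (北大核心)
Impact factor: 0.567
ISSN: 1674-8425
Year, volume (issue): 2024, 38(9)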