Research on mahjong game combining A2C with hand value evaluation method
To address the underutilization of hand information in popular mahjong, this paper designs a hand valuation method and a basic mahjong program (MJE). A mahjong AI (MJE-RL) is then designed using deep reinforcement learning to further improve its playing ability. First, the training data for deep learning is generated by MJE's self-play. Second, the best model is selected as the pre-training model for reinforcement learning, based on its results on the training set, the test set, and comparison experiments. Finally, the Advantage Actor-Critic (A2C) model is employed as the main framework for reinforcement learning. The well-trained deep learning model is used as the Actor to make decisions, and the playing ability of the mahjong AI is continually improved through games between MJE-RL and MJE. Our experimental results indicate that the winning rate of MJE-RL is 4.08% higher than that of MJE and that its rate of Win by Discard is 3.02% lower than that of MJE. These results show that MJE-RL markedly improves both offensive and defensive play, demonstrating the improved overall strength of the mahjong AI.
popular mahjong; incomplete information; deep reinforcement learning; A2C
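The abstract names A2C as the reinforcement learning framework but does not give the network architecture, reward design, or loss weighting. As a rough illustration only, the following is a minimal sketch of the standard one-step A2C quantities for a single transition, not the authors' implementation. The function names, the discount factor `gamma`, the three-way discard choice, and the reward of 1.0 for a win are all hypothetical values chosen for the example.

```python
import math


def softmax(logits):
    """Convert raw Actor scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def a2c_losses(logits, action, reward, value, next_value, gamma=0.99, done=False):
    """Textbook one-step A2C losses for one transition (illustrative only).

    logits     -- Actor scores for each candidate action (e.g. each discard)
    action     -- index of the action actually taken
    reward     -- reward observed after the action (assumed reward scheme)
    value      -- Critic estimate V(s) for the current state
    next_value -- Critic estimate V(s') for the next state
    """
    probs = softmax(logits)
    # Bootstrapped target: r + gamma * V(s'), with no bootstrap at game end.
    target = reward + (0.0 if done else gamma * next_value)
    # Advantage: how much better the outcome was than the Critic expected.
    advantage = target - value
    # Actor loss: raise the log-probability of actions with positive advantage.
    actor_loss = -math.log(probs[action]) * advantage
    # Critic loss: squared error between V(s) and the bootstrapped target.
    critic_loss = advantage ** 2
    return advantage, actor_loss, critic_loss


# Hypothetical transition: three candidate discards with equal Actor scores,
# the second is chosen, and the hand ends with an assumed reward of 1.0.
adv, actor_loss, critic_loss = a2c_losses(
    logits=[0.0, 0.0, 0.0], action=1, reward=1.0,
    value=0.25, next_value=0.0, done=True,
)
print(adv, actor_loss, critic_loss)
```

In a setup like the one the abstract describes, the Actor logits would come from the pre-trained deep learning model, and the transitions would be collected from games between MJE-RL and MJE.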