A DouDiZhu Strategy Based on an Improved DDQN Method
To address the problems of existing methods in card games, such as long training time, a large action space, and a low win rate, an improved network architecture and encoding scheme for the DDQN algorithm is proposed. The method encodes the cards in binary form, splits the neural network into a main-card network and a kicker-card network according to a card-splitting approach, and adds a GRU network to process action sequences. Experiments show that the algorithm's training time is 13% shorter than that of the traditional DQN algorithm, and that its average win rates in the 'landlord' and 'farmer' positions are 70% and 75%, exceeding the DQN algorithm by 28% and 60% respectively, demonstrating the advantages of the improved algorithm on these metrics.
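The abstract does not spell out the binary card encoding; the sketch below shows one common scheme used for DouDiZhu state representations, assuming a 15-rank deck (3 through 2 plus two jokers) encoded as a 4×15 binary matrix where column r has its first count(r) rows set to 1. The rank order and matrix layout here are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

# Assumed rank order for a DouDiZhu deck: 3..10, J, Q, K, A, 2, jokers.
RANKS = ['3', '4', '5', '6', '7', '8', '9', '10',
         'J', 'Q', 'K', 'A', '2', 'BJ', 'RJ']

def encode_hand(hand):
    """Encode a hand as a 4x15 binary matrix.

    For each rank r held n times, the first n rows of column r are
    set to 1 (a rank appears at most 4 times; jokers at most once).
    """
    mat = np.zeros((4, len(RANKS)), dtype=np.int8)
    for col, rank in enumerate(RANKS):
        n = hand.count(rank)
        mat[:n, col] = 1
    return mat

# Example: a triple of 3s, a single 5, a pair of Jacks, a 2, and the red joker.
hand = ['3', '3', '3', '5', 'J', 'J', '2', 'RJ']
encoded = encode_hand(hand)
```

Such a fixed-size binary matrix can be flattened and fed directly to the main-card and kicker-card networks, which is one reason this style of encoding is popular in DouDiZhu agents.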
deep reinforcement learning; Double deep Q-learning; computer games; Gate Recurrent Unit network; large-scale discrete action space
KONG Yan, WU Xiaocong, RUI Yefeng, SHI Hongyuan
School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China
Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing 210044, China