TibetanGoTinyNet: a lightweight U-Net style network for zero learning of Tibetan Go

Tibetan Go suffers from a scarcity of expert knowledge and research literature. We therefore study zero learning for Tibetan Go under limited computing resources and propose TibetanGoTinyNet, a novel scale-invariant, lightweight U-Net style network with two-headed output. Lightweight convolutional neural networks (CNNs) and capsule structures are applied in the encoder and decoder to reduce the computational burden and improve feature extraction. Several self-attention mechanisms are integrated into the network to capture spatial and global information on the Tibetan Go board and to select important channels. The training data are generated entirely from self-play games. TibetanGoTinyNet achieves a 62%-78% winning rate against four other U-Net style models: Res-UNet, Res-UNet Attention, Ghost-UNet, and Ghost Capsule-UNet. It also achieves a 75% winning rate in the ablation experiments on the lightweight self-attention mechanism that captures board position information. When migrated directly from 9x9 to 11x11 boards, the model saves about 33% of the training time and attains a 45%-50% winning rate across different Monte-Carlo tree search (MCTS) simulation counts. Code for our model is available at https://github.com/paulzyy/TibetanGoTinyNet.
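The abstract describes a scale-invariant network with two output heads that can be moved from 9x9 to 11x11 boards. The sketch below is not the authors' architecture; it only illustrates one common way such a property can arise, under the assumption that the network is fully convolutional, the policy head emits one score per board point, and the value head uses global average pooling. The names `conv3x3` and `two_headed_forward`, and the scalar weights `policy_w` and `value_w`, are hypothetical.

```python
import math
import random


def conv3x3(board, kernel, bias=0.0):
    """Zero-padded 3x3 convolution followed by ReLU, for any n x n board."""
    n = len(board)
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            acc = bias
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n and 0 <= cc < n:  # points off the board count as 0
                        acc += kernel[dr + 1][dc + 1] * board[rr][cc]
            out[r][c] = max(0.0, acc)
    return out


def two_headed_forward(board, kernel, policy_w, value_w):
    """Return (policy, value) for a board of any size using the same weights.

    Assumption: a single shared feature map stands in for the whole
    encoder/decoder; the real model is far deeper.
    """
    n = len(board)
    feat = conv3x3(board, kernel)

    # Policy head: a 1x1 convolution gives one logit per point, then softmax.
    logits = [policy_w * feat[r][c] for r in range(n) for c in range(n)]
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    policy = [e / total for e in exps]

    # Value head: global average pooling removes the dependence on board size.
    mean_feat = sum(map(sum, feat)) / (n * n)
    value = math.tanh(value_w * mean_feat)
    return policy, value


rng = random.Random(0)
kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
for size in (9, 11):
    board = [[rng.choice((-1, 0, 1)) for _ in range(size)] for _ in range(size)]
    policy, value = two_headed_forward(board, kernel, policy_w=0.5, value_w=0.8)
    print(size, len(policy), round(sum(policy), 6), -1.0 <= value <= 1.0)
```

Because no weight has a shape tied to the board size, the same `kernel`, `policy_w`, and `value_w` produce a 81-entry policy on a 9x9 board and a 121-entry policy on an 11x11 board. The sketch omits a pass move and every other detail of the published model.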

Keywords: Zero learning; Tibetan Go; U-Net; Self-attention mechanism; Capsule network; Monte-Carlo tree search

Xiali Li (李霞丽), Yanyin Zhang (张焱垠), Licheng Wu (吴立成), Yandong Chen (陈彦东), Junzhi Yu (喻俊志)

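Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of the Ministry of Education, Minzu University of China, Beijing 100081, China

School of Information Engineering, Minzu University of China, Beijing 100081, China

Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing 100871, China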

Funding: National Natural Science Foundation of China (Nos. 62276285 and 62236011); Major Projects of Social Science Foundation of China (No. 20&ZD279)

2024

Frontiers of Information Technology & Electronic Engineering (English edition)
Zhejiang University

CSTPCD
Impact factor: 0.371
ISSN: 2095-9184
Year, volume (issue): 2024, 25(7)