TibetanGoTinyNet: a lightweight U-Net style network for zero learning of Tibetan Go

Xiali Li (李霞丽)¹, Yanyin Zhang (张焱垠)¹, Licheng Wu (吴立成)¹, Yandong Chen (陈彦东)¹, Junzhi Yu (喻俊志)²

Author information

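1. Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of the Ministry of Education, Minzu University of China, Beijing 100081, China; School of Information Engineering, Minzu University of China, Beijing 100081, China
2. Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing 100871, China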
Abstract

Tibetan Go suffers from a scarcity of expert knowledge and research literature. We therefore study zero-learning models for Tibetan Go under limited computing resources and propose TibetanGoTinyNet, a novel scale-invariant, lightweight U-Net style network with two output heads. Lightweight convolutional neural networks (CNNs) and a capsule structure are applied in the encoder and decoder of TibetanGoTinyNet to reduce the computational burden and improve feature extraction. Several self-attention mechanisms are integrated into TibetanGoTinyNet to capture spatial and global information on the Tibetan Go board and to select important channels. The training data are generated entirely from self-play games. TibetanGoTinyNet achieves a 62%-78% winning rate against four other U-Net style models: Res-UNet, Res-UNet Attention, Ghost-UNet, and Ghost Capsule-UNet. It also achieves a 75% winning rate in the ablation experiments on the attention mechanism with embedded positional information. When migrated directly from 9×9 to 11×11 boards, the model saves about 33% of the training time and attains a 45%-50% winning rate across different Monte-Carlo tree search (MCTS) simulation counts. Code for our model is available at https://github.com/paulzyy/TibetanGoTinyNet.
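
The abstract does not say how the two-headed network is made independent of board size. One common way, assumed here purely for illustration, is to keep the network fully convolutional: a per-point policy head and a value head that uses global average pooling. The sketch below is a toy single-filter model in plain Python, not the authors' architecture; the names `conv3x3_same`, `two_headed_forward`, `random_board`, `policy_w`, and `value_w` are hypothetical. It only shows that the same weights can be evaluated on both 9×9 and 11×11 boards.

```python
import math
import random


def conv3x3_same(board, kernel):
    """Apply one 3x3 filter with zero padding, then ReLU.

    The output has the same height and width as the input board.
    """
    n = len(board)
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n and 0 <= cc < n:
                        acc += kernel[dr + 1][dc + 1] * board[rr][cc]
            out[r][c] = max(acc, 0.0)
    return out


def two_headed_forward(board, kernel, policy_w, value_w):
    """Return (policy, value) for a square board of any size.

    Assumption: a single shared feature map feeds both heads.
    """
    n = len(board)
    feat = conv3x3_same(board, kernel)

    # Policy head: a 1x1 convolution (one scalar weight here), then a
    # softmax over all board points.
    logits = [policy_w * feat[r][c] for r in range(n) for c in range(n)]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    policy = [e / total for e in exps]

    # Value head: global average pooling removes the dependence on board
    # size, and tanh maps the result into [-1, 1].
    pooled = sum(sum(row) for row in feat) / (n * n)
    value = math.tanh(value_w * pooled)
    return policy, value


def random_board(n, rng):
    """Random position: -1 and 1 are the two players' stones, 0 is empty."""
    return [[rng.choice((-1, 0, 1)) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
kernel = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
policy_w, value_w = 0.5, 0.8  # arbitrary illustrative weights

for size in (9, 11):
    policy, value = two_headed_forward(
        random_board(size, rng), kernel, policy_w, value_w
    )
    print(size, len(policy), -1.0 <= value <= 1.0)
```

Because no layer has a shape tied to the board size, weights learned on the smaller board can be reused as a starting point on the larger one, which is the kind of transfer the abstract reports.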

Key words

Zero learning/Tibetan Go/U-Net/Self-attention mechanism/Capsule network/Monte-Carlo tree search

Funding

National Natural Science Foundation of China(62276285)

National Natural Science Foundation of China(62236011)

Major Projects of Social Science Foundation of China(20&ZD279)

Publication year

2024

Frontiers of Information Technology & Electronic Engineering

Zhejiang University

CSTPCD

Impact factor: 0.371

ISSN: 2095-9184
