
Tibetan Pre-training Model Based on Attention Heads and Part-of-Speech Fusion

To better model the linguistic characteristics of Tibetan and to determine the optimal number of attention heads for a Tibetan pre-trained language model, part-of-speech information was fused into the pre-trained model, and comparative experiments were conducted to find the best head count, with the goal of improving the model's understanding of Tibetan and its performance on downstream tasks. The results show that across multiple classification tasks, the pre-trained model with 12 attention heads performed well. Furthermore, after part-of-speech information was fused into the pre-trained model, the macro-F1 scores on the text, title, and sentiment classification tasks improved by 0.57%, 0.92%, and 1.01%, respectively. These results demonstrate that incorporating part-of-speech features enables the model to capture Tibetan language structure and grammatical rules more accurately, thereby raising classification accuracy.
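The page does not include the paper's code. As a rough illustration of the two ideas in the abstract — choosing the number of attention heads for a BERT-style Tibetan encoder and fusing part-of-speech (POS) embeddings with the token embeddings — the following PyTorch/Transformers sketch shows one plausible realization. Everything here is an assumption: the class name TibetanPosBert, the additive POS-embedding fusion, the vocabulary sizes, and the set of head counts compared are illustrative, not the authors' actual implementation.

```python
# A minimal sketch (not the authors' code): a BERT-style encoder with a
# configurable attention-head count, plus part-of-speech (POS) embeddings
# summed with the word embeddings before encoding. Names and sizes here
# (TibetanPosBert, pos_vocab_size, ...) are illustrative assumptions.
import torch.nn as nn
from transformers import BertConfig, BertModel

class TibetanPosBert(nn.Module):
    def __init__(self, vocab_size=30000, pos_vocab_size=40,
                 hidden_size=768, num_attention_heads=12, num_labels=3):
        super().__init__()
        # The abstract reports 12 attention heads performed best among the
        # counts compared; hidden_size must be divisible by the head count.
        config = BertConfig(
            vocab_size=vocab_size,
            hidden_size=hidden_size,
            num_attention_heads=num_attention_heads,
            num_hidden_layers=12,
        )
        self.bert = BertModel(config)
        # POS fusion: a separate embedding table for POS tags, added to the
        # word embeddings (one simple fusion strategy; the paper's exact
        # mechanism is not specified on this page).
        self.pos_embeddings = nn.Embedding(pos_vocab_size, hidden_size)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, pos_tag_ids, attention_mask=None):
        word_emb = self.bert.embeddings.word_embeddings(input_ids)
        fused = word_emb + self.pos_embeddings(pos_tag_ids)
        # inputs_embeds replaces only the word embeddings; BERT still adds
        # position and token-type embeddings internally.
        out = self.bert(inputs_embeds=fused, attention_mask=attention_mask)
        # Use the [CLS] representation for the classification tasks
        # (text, title, and sentiment classification in the paper).
        return self.classifier(out.last_hidden_state[:, 0])

# Head-count comparison in the spirit of the paper's experiments: train and
# evaluate one model per head count, then keep the best macro-F1.
for heads in (4, 8, 12, 16):
    model = TibetanPosBert(num_attention_heads=heads)
    # ... train / evaluate on the classification tasks, record macro-F1 ...
```

Summing a learned POS embedding with the word embedding is the simplest fusion strategy; concatenation followed by a linear projection is a common alternative, and the hidden size must remain divisible by each head count tested.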

attention mechanism; part-of-speech; pre-trained language model; text classification; sentiment classification

张英, 拥措, 斯曲卓嘎, 拉毛杰, 扎西永珍, 尼玛扎西


School of Information Science and Technology, Tibet University, Lhasa 850000

Key Laboratory of Tibetan Information Technology and Artificial Intelligence of the Tibet Autonomous Region, Lhasa 850000

Engineering Research Center of Tibetan Information Technology, Ministry of Education, Lhasa 850000


Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project and a project of the Science and Technology Department of the Tibet Autonomous Region

2022ZD0116100, XZ202401JD0010

2024

Science Technology and Engineering (科学技术与工程)
中国技术经济学会 (Chinese Society of Technology Economics)


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.338
ISSN: 1671-1815
Year, Volume (Issue): 2024, 24(23)