Tibetan Pre-training Model Based on Attention Heads and Part-of-Speech Fusion
To acquire superior Tibetan representations and enhance the model's understanding of Tibetan features, part-of-speech information was incorporated into a Tibetan pre-trained language model. Meanwhile, to improve performance on downstream tasks, the optimal number of attention heads for the Tibetan pre-trained language model was explored through comparative experiments. The results show that pre-trained language models with 12 attention heads perform well across multiple classification tasks. Furthermore, after incorporating part-of-speech information into the pre-trained language models, the macro-F1 values on the text, title, and sentiment classification tasks increase by 0.57%, 0.92%, and 1.01%, respectively. It is concluded that, with part-of-speech features incorporated, the model better captures the linguistic structure and grammatical rules of Tibetan.
attention mechanism; part-of-speech; pre-trained language models; text classification; sentiment classification
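To make the fusion idea in the abstract concrete, the following is a minimal PyTorch sketch of one common way to inject part-of-speech information into a 12-head Transformer encoder: embedding the POS tags and adding them to the token embeddings, much like BERT's segment embeddings. All names, dimensions, and the additive-fusion choice here are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of part-of-speech fusion in a Transformer encoder.
# Hypothetical names and sizes; the additive fusion is an assumption,
# not necessarily the method used in the paper.
import torch
import torch.nn as nn

class PosFusedEncoder(nn.Module):
    def __init__(self, vocab_size, pos_tag_count,
                 hidden_size=768, num_heads=12, num_layers=12):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden_size)
        # Separate embedding table for part-of-speech tags (the fused feature).
        self.pos_tag_emb = nn.Embedding(pos_tag_count, hidden_size)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids, pos_tag_ids):
        # Fuse POS information by adding its embedding to the token
        # embedding, analogous to segment embeddings in BERT.
        x = self.token_emb(token_ids) + self.pos_tag_emb(pos_tag_ids)
        return self.encoder(x)

# Usage: a batch of 2 sequences, 8 subwords each, with aligned POS tags.
model = PosFusedEncoder(vocab_size=30000, pos_tag_count=30)
tokens = torch.randint(0, 30000, (2, 8))
pos_tags = torch.randint(0, 30, (2, 8))
hidden = model(tokens, pos_tags)  # shape: (2, 8, 768)
```

With this kind of fusion, the head count (here `num_heads=12`, matching the best-performing configuration reported in the abstract) can be varied independently of the POS embeddings, which is what makes the comparative head-count experiments possible.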