Science Technology and Engineering, 2024, Vol. 24, Issue 23: 9957-9964. DOI: 10.12404/j.issn.1671-1815.2305750

基于注意力头数和词性融合的藏文预训练模型

Tibetan Pre-training Model Based on Attention Heads and Part-of-Speech Fusion

张英¹ 拥措¹ 斯曲卓嘎¹ 拉毛杰¹ 扎西永珍¹ 尼玛扎西¹
Author Information

  • 1. School of Information Science and Technology, Tibet University, Lhasa 850000; Tibet Autonomous Region Key Laboratory of Tibetan Information Technology and Artificial Intelligence, Lhasa 850000; Engineering Research Center of Tibetan Information Technology, Ministry of Education, Lhasa 850000

Abstract

To better learn Tibetan language features and to determine the optimal number of attention heads for a Tibetan pre-trained language model, part-of-speech information was combined with the Tibetan pre-trained model, and comparative experiments were conducted to find the best number of attention heads, with the aim of improving the model's understanding of Tibetan language features and its performance on downstream tasks. Experimental results show that the pre-trained model with 12 attention heads performs well across multiple classification tasks. In addition, after fusing part-of-speech into the pre-trained model, the F1 scores on the text, title, and sentiment classification tasks improved by 0.57%, 0.92%, and 1.01%, respectively. These results demonstrate that with part-of-speech features incorporated, the model can understand Tibetan language structure and grammatical rules more accurately, thereby improving classification accuracy.

Abstract

In order to learn Tibetan language features more effectively and to enhance the model's understanding of those features, part-of-speech information was combined with a Tibetan pre-trained language model. To improve the performance of downstream tasks, the optimal number of attention heads for the Tibetan pre-trained language model was also explored through comparative experiments. The results show that pre-trained language models with 12 attention heads perform well on multiple classification tasks. Furthermore, after incorporating part-of-speech into the pre-trained language models, the macro-F1 values of the text, title, and sentiment classification tasks increase by 0.57%, 0.92%, and 1.01%, respectively. It is concluded that, with part-of-speech features incorporated, the language structure and grammar rules of Tibetan can be better understood.
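The fusion the abstract describes can be illustrated with a minimal sketch. The vocabulary size, tag-set size, and additive fusion below are assumptions for illustration only; the paper's actual Tibetan pre-trained architecture and fusion mechanism may differ. Part-of-speech tag embeddings are combined with token embeddings before a Transformer encoder whose head count is the hyperparameter the paper tunes (12 performed best in their experiments):

```python
import torch
import torch.nn as nn

class PosFusedEncoder(nn.Module):
    """Illustrative sketch: fuse part-of-speech embeddings with token
    embeddings before a Transformer encoder. num_heads is the quantity
    compared in the paper (12 reported as best); the additive fusion
    here is an assumption, not the paper's confirmed mechanism."""

    def __init__(self, vocab_size=3000, num_pos_tags=30,
                 d_model=768, num_heads=12, num_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)      # token IDs -> vectors
        self.pos_tag_emb = nn.Embedding(num_pos_tags, d_model)  # POS-tag IDs -> vectors
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids, pos_tag_ids):
        # Additive fusion: token embedding + POS-tag embedding per position
        x = self.tok_emb(token_ids) + self.pos_tag_emb(pos_tag_ids)
        return self.encoder(x)

model = PosFusedEncoder()
tokens = torch.randint(0, 3000, (2, 16))    # batch of 2 sequences, length 16
pos_tags = torch.randint(0, 30, (2, 16))    # one POS tag per token
out = model(tokens, pos_tags)
print(out.shape)  # torch.Size([2, 16, 768])
```

Note that d_model must be divisible by the head count (768 / 12 = 64 per head), which constrains which head numbers can be compared at a fixed hidden size.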

Key words

attention mechanism/part-of-speech/pre-trained language model/text classification/sentiment classification



Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116100)

Project of the Science and Technology Department of Tibet Autonomous Region (XZ202401JD0010)

Publication Year

2024

Science Technology and Engineering
Publisher: 中国技术经济学会
Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact Factor: 0.338
ISSN: 1671-1815