Computer Simulation, 2024, Vol. 41, Issue (6): 396-400, 502.

Research on Tibetan Named Entity Recognition Based on Multi-Feature Fusion

索朗次仁¹ 杨宇帆¹ 高定国¹

Author information

  • 1. School of Information Science and Technology, Tibet University, Lhasa, Tibet 850000

Abstract

Tibetan named entity recognition is one of the key problems that Tibetan information processing must solve. To address the limited variety of features extracted by earlier named entity recognition methods, this paper proposes an IDCNN-BiGRU-CRF model based on multi-feature fusion. The model trains Tibetan word segmentation and Tibetan entity recognition jointly, and combines a bidirectional gated recurrent unit network (BiGRU) with an iterated dilated convolutional network (IDCNN), which extract contextual features and local features respectively; the two kinds of features are fused to improve entity recognition. Experiments show that, compared with the current mainstream Tibetan named entity recognition methods BiLSTM-CRF, BiGRU-CRF, and IDCNN-BiLSTM-CRF, the F1 score of the model improves by 0.51%, 0.47%, and 0.14% respectively, which verifies the effectiveness of the model for Tibetan named entity recognition.
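The fusion idea in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the width-3 kernel, the dilation schedule (1, 1, 2), the running-mean stand-in used in place of a real BiGRU, fusion by concatenation, and all function names are assumptions made only for illustration. A real IDCNN would also apply the dilated block repeatedly with learned weights, and a CRF layer would decode tags from the fused features.

```python
from typing import List, Sequence

Vector = List[float]


def dilated_conv1d(seq: List[Vector], kernel: Sequence[float], dilation: int) -> List[Vector]:
    """Width-3 dilated convolution over time with zero padding, followed by ReLU.

    Simplification (assumption): the same three weights are shared by every channel.
    """
    n = len(seq)
    dim = len(seq[0]) if seq else 0
    out: List[Vector] = []
    for t in range(n):
        acc = [0.0] * dim
        for k, w in enumerate(kernel):
            src = t + (k - 1) * dilation  # taps at t - d, t, t + d
            if 0 <= src < n:
                for c in range(dim):
                    acc[c] += w * seq[src][c]
        out.append([max(0.0, v) for v in acc])
    return out


def local_features(seq: List[Vector],
                   kernel: Sequence[float] = (1.0, 1.0, 1.0),
                   dilations: Sequence[int] = (1, 1, 2)) -> List[Vector]:
    """One block of stacked dilated convolutions (hypothetical dilation schedule)."""
    hidden = seq
    for d in dilations:
        hidden = dilated_conv1d(hidden, kernel, d)
    return hidden


def context_features(seq: List[Vector]) -> List[Vector]:
    """Stand-in for a BiGRU: running means of the tokens seen so far in each direction."""
    if not seq:
        return []

    def running_mean(items: List[Vector]) -> List[Vector]:
        out: List[Vector] = []
        total = [0.0] * len(items[0])
        for i, vec in enumerate(items, start=1):
            total = [a + b for a, b in zip(total, vec)]
            out.append([a / i for a in total])
        return out

    forward = running_mean(seq)
    backward = running_mean(seq[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]


def fuse(local: List[Vector], context: List[Vector]) -> List[Vector]:
    """Fuse the two views of each token by concatenating their feature vectors."""
    return [l + c for l, c in zip(local, context)]
```

Feeding a single non-zero token through `local_features` shows how the stacked dilations widen the window each position can see, while `fuse` gives every token one vector that carries both the local and the sequence-wide view.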

Key words

Multi-feature fusion/Tibetan/Named entity recognition

Funding

National Natural Science Foundation of China (62166038)

Publication year

2024
Computer Simulation
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348