Named Entity Recognition in the Field of Party Building Based on BERT-BiLSTM-CRF
赵盾 (Zhao Dun) 1, 佘学兵 (She Xuebing) 2, 邬昌兴 (Wu Changxing) 3
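Author information
- 1. Jinshan College, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002
- 2. Jiangxi University of Technology, Nanchang, Jiangxi 330098
- 3. East China Jiaotong University, Nanchang, Jiangxi 330013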
Abstract
When traditional named entity recognition (NER) methods are used to construct a knowledge graph in the field of party building, they suffer from unclear entity boundaries and part-of-speech ambiguity of entity terms, which lowers recognition accuracy and efficiency. To address these issues, this paper proposes a BERT-BiLSTM-CRF entity recognition model that integrates tree-like probability and a domain dictionary. The model embeds the domain dictionary into BERT to produce vector representations of the text, uses BiLSTM to capture contextual semantic features, and applies tree-like probability to the transition probability calculation in the CRF layer to improve word segmentation accuracy. In comparative experiments against the baseline model on the MSRA corpus and a self-constructed corpus, the proposed model achieves better performance on all three metrics: F1-score, recall, and precision.
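The role of transition scores in the CRF layer can be illustrated with a minimal Viterbi decoding sketch. It is not the authors' implementation: the tag set, the emission scores (standing in for BiLSTM outputs), and the transition scores are all made-up values, and the tree-like probability adjustment is not reproduced because the abstract does not define it. The function name `viterbi_decode` is likewise hypothetical.

```python
from typing import Dict, List, Tuple

Tag = str


def viterbi_decode(
    emissions: List[Dict[Tag, float]],
    transitions: Dict[Tuple[Tag, Tag], float],
    tags: List[Tag],
) -> List[Tag]:
    """Return the highest-scoring tag path given emission and transition scores."""
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[Tag, Tag]] = []
    for t in range(1, len(emissions)):
        scores: Dict[Tag, float] = {}
        pointers: Dict[Tag, Tag] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score plus transition score.
            prev = max(
                tags,
                key=lambda p: best[t - 1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[t - 1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative (assumed) scores for a three-token sentence.
demo_tags = ["O", "B-ORG", "I-ORG"]
demo_emissions = [
    {"O": 0.1, "B-ORG": 2.0, "I-ORG": 0.0},
    {"O": 1.0, "B-ORG": 0.0, "I-ORG": 0.9},
    {"O": 2.0, "B-ORG": 0.0, "I-ORG": 0.1},
]
demo_transitions = {
    ("B-ORG", "I-ORG"): 1.0,     # continuing an entity is encouraged
    ("O", "I-ORG"): -10000.0,    # an entity cannot start with an inside tag
}
demo_path = viterbi_decode(demo_emissions, demo_transitions, demo_tags)
print(demo_path)
```

With these assumed scores, picking the best tag per token independently would label the second token `O`, while the transition bonus for `B-ORG` followed by `I-ORG` makes the decoded path keep the entity together. This is the kind of boundary decision that transition scores in a CRF layer influence.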
Key words
BERT-BiLSTM-CRF model/tree-like probability/domain dictionary/named entity recognition
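Funding
National Natural Science Foundation of China, Regional Science Fund Project (62266017)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ2202608)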
Publication year
2024