
Research on Named Entity Recognition of MRC Model in Water Conservancy Field Based on Data Enhancement

Named entity recognition (NER) in the water conservancy field is important for building water conservancy knowledge graphs and intelligent question answering systems. However, NER in this field currently suffers from a lack of annotated corpora, the low recognition accuracy of traditional methods, and an inability to handle polysemous entities. Based on the characteristics of water conservancy texts, the authors propose a machine reading comprehension (MRC) named entity recognition model with data enhancement (vocabulary and entity type labels), named MRC-WLE, which injects vocabulary feature information and entity type label feature information from water conservancy texts into the model as "knowledge". BERT-CRF, BERT-CRF-Word, BERT-BiLSTM-CRF, and BERT-BiLSTM-CRF-Word are used as baselines to evaluate the performance of MRC-WLE. The results show that the micro-averaged F1 score of MRC-WLE is higher than that of each of these baselines. Compared with the MRC model, the micro-averaged F1 score of MRC-WLE is 0.85% higher, which demonstrates the effectiveness of the data enhancement.
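The abstract's two technical ingredients, the MRC framing of NER and the micro-averaged F1 metric, can be illustrated with a minimal Python sketch. The entity types, the query wording, and the function names `build_mrc_inputs` and `micro_f1` are hypothetical and chosen only for illustration; the abstract does not give the paper's label set, query templates, or the way vocabulary and label features are injected, so none of that is reproduced here.

```python
# Hypothetical entity types and query wording (assumptions, not from the paper).
TYPE_QUERIES = {
    "RIVER": "Find the rivers mentioned in the text.",
    "RESERVOIR": "Find the reservoirs mentioned in the text.",
}


def build_mrc_inputs(sentence, type_queries=TYPE_QUERIES):
    """Pair the sentence with one natural-language query per entity type.

    In an MRC formulation of NER, each (query, sentence) pair is fed to the
    reader, which then extracts the spans that answer the query.
    """
    return [(etype, query, sentence) for etype, query in type_queries.items()]


def micro_f1(gold, pred):
    """Micro-averaged F1 over (sentence_id, start, end, entity_type) spans.

    Pooling all spans into one set before counting matches is what makes the
    average "micro": every entity counts equally, regardless of its type.
    """
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for etype, query, context in build_mrc_inputs("黄河流经小浪底水库。"):
        print(etype, "|", query, "|", context)

    gold = {(0, 0, 2, "RIVER"), (0, 5, 8, "RESERVOIR"), (1, 0, 3, "RIVER")}
    pred = {(0, 0, 2, "RIVER"), (0, 5, 8, "RIVER"), (1, 0, 3, "RIVER")}
    print(round(micro_f1(gold, pred), 4))
```

In this toy case the second predicted span has the right boundaries but the wrong type, so it counts as both a false positive and a false negative under exact-match scoring.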

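Keywords: water conservancy field; named entity recognition; data enhancement; machine reading comprehension (MRC)

Zhu Yongming (朱永明), Xing Danyan (邢丹艳)

School of Management, Zhengzhou University, Zhengzhou 450001, Henan, China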

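Funding: General Project of Humanities and Social Sciences Research, Ministry of Education (20YJA630101); Major Project of the Chinese Society of Academic Degrees and Graduate Education (2020ZDB20)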

2024

Yellow River (人民黄河)
Yellow River Conservancy Commission, Ministry of Water Resources (水利部黄河水利委员会)

Indexed in: CSTPCD, Peking University Core Journals (北大核心)
Impact factor: 0.494
ISSN: 1000-1379
Year, volume (issue): 2024, 46(9)