Computer Engineering & Science, 2024, Vol. 46, Issue 5: 916-928. DOI: 10.3969/j.issn.1007-130X.2024.05.017

Entity relation extraction based on prejudgment and multi-round classification for span

Tong Yuan (佟缘), Yao Nianmin (姚念民)

Author information

  • 1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China

Abstract

For the entity recognition and relation extraction tasks in natural language processing, a model named Smrc is proposed that makes predictions at the level of token sequences (spans). The model uses the pre-trained BERT model as its encoder and includes three further modules: entity pre-judgment (Pej), entity multi-round classification (Emr), and relation multi-round classification (Rmr). Smrc performs entity recognition through the preliminary judgment of the Pej module followed by the multi-round entity classification of the Emr module. It then uses the multi-round relation classification of the Rmr module to determine the relation between each pair of entities, which completes the relation extraction task. On the CoNLL04, SciERC, and ADE datasets, Smrc reaches entity recognition F1 scores of 89.67%, 70.62%, and 89.56%, and relation extraction F1 scores of 73.11%, 51.03%, and 79.89%, respectively. Compared with Spert, the previous best model on the three datasets, Smrc improves F1 by 0.73, 0.29, and 0.61 percentage points on entity recognition and by 1.64, 0.19, and 1.05 percentage points on relation extraction. The gains come from entity pre-judgment and from multi-round classification of entities and relations, and they demonstrate the effectiveness and advantages of the model.
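The abstract states that Smrc predicts at the span level but gives no detail on how candidate spans are formed. As a minimal sketch of what span-level prediction generally starts from, the Python snippet below enumerates every contiguous token span up to a fixed width; a model such as Smrc would then score these candidates. The function name `enumerate_spans`, the `max_width` limit, and the example sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_width: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for every span of 1..max_width tokens.

    Hypothetical helper: the width limit is an assumption, not a setting from the paper.
    """
    spans = []
    for start in range(len(tokens)):
        # The longest span from this start is capped by max_width and by sentence length.
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


# Illustrative sentence (made up for this sketch).
tokens = ["Yao", "works", "in", "Dalian"]
spans = enumerate_spans(tokens, max_width=2)
candidates = [" ".join(tokens[s:e]) for s, e in spans]
```

Each candidate would then pass through the stages the abstract names: a pre-judgment of whether the span is an entity at all, entity type classification, and relation classification over pairs of accepted entities. How Smrc implements those stages is not described here, so they are left out of the sketch.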

Key words

pre-judgment of span / entity relation extraction / BERT pre-trained model / multi-round entity classification / multi-round relation classification

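Publication year

2024

Computer Engineering & Science
College of Computer, National University of Defense Technology

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Number of references: 33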