Journal of China Academy of Electronics and Information Technology (中国电子科学研究院学报), 2024, Vol. 19, Issue (1): 84-90. DOI: 10.3969/j.issn.1673-5692.2024.01.011

An Entity-Relation Extraction Model Fusing Self-Attention Mechanism and Entity Type Knowledge

张思邈¹ 朱继召¹ 刘颢² 范纯龙¹

Author Information

  • 1. School of Computer Science, Shenyang Aerospace University, Shenyang 110136, Liaoning, China
  • 2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Wuhan Digital Engineering Institute, Wuhan 430074, Hubei, China

Abstract

Entity-relation extraction from unstructured text is a key task in natural language processing. Mainstream methods adopt joint extraction, which automatically captures the dependency knowledge between entities and relations during training and improves the extraction of both. However, these methods ignore entity type knowledge, which leads to a large amount of redundant computation and incorrect results. To this end, we present a joint entity-relation extraction method that integrates a self-attention mechanism and entity type knowledge. First, the pre-trained model BERT is used as the encoder to obtain a vector representation of each character in the sentence, which is then passed through a bidirectional LSTM layer to produce the final semantic representation. Second, head and tail entities are identified from the output of the encoding layer. Then, the semantic representation of each head entity is iteratively fused into the sentence representation to detect potential semantic relations under the constraints of the head entity's type. Finally, the head entity and relation are fed into a self-attention module to identify the corresponding tail entity, yielding the entity-relation triples. Experiments on the public datasets NYT and WebNLG show that the proposed model achieves F1 scores of 93.2% and 93.3%, respectively, on the joint entity-relation extraction task, a significant improvement over current mainstream models.
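The final step of the pipeline described in the abstract (conditioning on a recognized head entity and a predicted relation, then tagging the corresponding tail entity via self-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the array names, dimensions, the simple additive fusion of head-entity and relation vectors, and the sigmoid token tagger are all illustrative assumptions, with NumPy standing in for a deep-learning framework.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h, wq, wk, wv):
    # Scaled dot-product self-attention over token vectors h of shape (n, d).
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
n, d = 6, 8                              # toy sentence length and hidden size
h = rng.standard_normal((n, d))          # stands in for the BERT + BiLSTM token outputs
head = rng.standard_normal(d)            # representation of one recognized head entity
rel = rng.standard_normal(d)             # embedding of one relation predicted for that head
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
w_tag = rng.standard_normal(d)           # hypothetical tagging weight vector

fused = h + head + rel                   # broadcast: inject head/relation info into every token
z = self_attention(fused, wq, wk, wv)    # contextualize the conditioned representation
tail_start = 1.0 / (1.0 + np.exp(-(z @ w_tag)))  # per-token sigmoid score: "tail starts here"
print(tail_start.shape)                  # one score per token: (6,)
```

In a trained model the sigmoid scores would be thresholded to mark tail-entity boundaries, completing a (head, relation, tail) triple; repeating this per head entity and per relation realizes the cascaded extraction the abstract describes.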

Key words

self-attention mechanism / BERT / entity-relation triples / joint extraction


Funding

National Natural Science Foundation of China (62076249)

Publication Year

2024
Journal of China Academy of Electronics and Information Technology (中国电子科学研究院学报)
China Academy of Electronics and Information Technology


Indexed in: CSTPCD
Impact factor: 0.663
ISSN: 1673-5692
Reference count: 20