Journal of China Academy of Electronics and Information Technology (中国电子科学研究院学报), 2024, Vol. 19, Issue (1): 84-90. DOI: 10.3969/j.issn.1673-5692.2024.01.011

An Entity-Relation Extraction Model Fusing Self-Attention Mechanism and Entity Type Knowledge

张思邈¹ 朱继召¹ 刘颢² 范纯龙¹

Author Information

  • 1. School of Computer Science, Shenyang Aerospace University, Shenyang 110136, Liaoning, China
  • 2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Wuhan Digital Engineering Institute, Wuhan 430074, Hubei, China

Abstract

Entity-relation extraction from unstructured text is a key task in natural language processing. Mainstream methods adopt joint extraction, which automatically captures the dependency knowledge between entities and relations during training and improves the extraction of both. However, these methods ignore entity type knowledge, which leads to a large amount of redundant computation and incorrect results. To this end, we present a joint entity-relation extraction method that integrates a self-attention mechanism and entity type knowledge. First, the pre-trained model BERT is used as the encoder to obtain a vector representation of each character in the sentence, which is then passed through a bidirectional LSTM layer to produce the final semantic representation. Second, head and tail entities are identified from the output of the encoding layer. Then, the semantic representation of each head entity is iteratively fused into the sentence representation to detect potential semantic relations under the constraints of the head entity's type. Finally, the head entity and relation are fed into a self-attention module to identify the corresponding tail entity, yielding the entity-relation triples. Experiments on the public datasets NYT and WebNLG show that the proposed model achieves F1 scores of 93.2% and 93.3%, respectively, on the joint entity-relation extraction task, a significant improvement over current mainstream models.
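The final step of the pipeline described in the abstract (conditioning on a recognized head entity and a predicted relation, then tagging the corresponding tail entity via self-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the array names, dimensions, the simple additive fusion of head-entity and relation vectors, and the sigmoid token tagger are all illustrative assumptions, with NumPy standing in for a deep-learning framework.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h, wq, wk, wv):
    # Scaled dot-product self-attention over token vectors h of shape (n, d).
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
n, d = 6, 8                              # toy sentence length and hidden size
h = rng.standard_normal((n, d))          # stands in for the BERT + BiLSTM token outputs
head = rng.standard_normal(d)            # representation of one recognized head entity
rel = rng.standard_normal(d)             # embedding of one relation predicted for that head
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
w_tag = rng.standard_normal(d)           # hypothetical tagging weight vector

fused = h + head + rel                   # broadcast: inject head/relation info into every token
z = self_attention(fused, wq, wk, wv)    # contextualize the conditioned representation
tail_start = 1.0 / (1.0 + np.exp(-(z @ w_tag)))  # per-token sigmoid score: "tail starts here"
print(tail_start.shape)                  # one score per token: (6,)
```

In a trained model the sigmoid scores would be thresholded to mark tail-entity boundaries, completing a (head, relation, tail) triple; repeating this per head entity and per relation realizes the cascaded extraction the abstract describes.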

Key words

self-attention mechanism / BERT / entity-relation triples / joint extraction


Funding

National Natural Science Foundation of China (62076249)

Publication Year

2024
Journal of China Academy of Electronics and Information Technology (中国电子科学研究院学报)
China Academy of Electronics and Information Technology


Indexed in: CSTPCD
Impact factor: 0.663
ISSN: 1673-5692
Reference count: 20