An Entity Relation Extraction Method Based on Heterogeneous Graph Neural Networks and Text Semantic Enhancement
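Peng Bo (彭勃) 1, Li Yaodong (李耀东) 1, Gong Xianfu (龚贤夫) 1, Li Hao (李浩) 2
Author information
- 1. Power Grid Planning Research Center, Guangdong Power Grid Co., Ltd., Guangzhou 510080
- 2. College of Computer Science, Sichuan University, Chengdu 610065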
Abstract
Extracting structured information from massive volumes of natural language text has become a research hotspot in the information age. The complex knowledge found in power systems needs to be organized by building a knowledge graph, and entity relation extraction is the upstream information extraction task for that graph: how completely it is carried out directly affects the effectiveness of the knowledge graph. With the continued development of deep learning, research applying deep learning techniques to entity relation extraction has expanded and achieved good results. However, problems remain, such as the incomplete use of text semantics. To address these problems, this paper proposes an entity relation extraction method based on a heterogeneous graph neural network and text semantic enhancement. The method uses word nodes and relation nodes to learn semantic features, obtains the initial features of the two node types from BERT and from a pre-training task respectively, updates them iteratively through a multi-layer graph network, and in each layer lets the two node types interact through message passing based on a multi-head attention mechanism. In comparative experiments against other entity relation extraction models on two public datasets, the proposed model achieves the expected results and generally outperforms the comparison models across a variety of settings.
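The layer-wise interaction between word nodes and relation nodes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`multi_head_attend`, `run_layers`), the dimensions, the update order, and the residual connection are all assumptions made for illustration, the learned query/key/value projections of a real multi-head attention layer are omitted, and random vectors stand in for the BERT word features and the pre-trained relation features.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_attend(queries, sources, num_heads):
    """Each query node aggregates the source nodes with scaled dot-product
    attention, computed independently per head on a slice of the features.
    Assumption: no learned projections, so this is a parameter-free sketch."""
    dim = len(queries[0])
    head_dim = dim // num_heads
    updated = []
    for q in queries:
        out = []
        for h in range(num_heads):
            lo, hi = h * head_dim, (h + 1) * head_dim
            scores = [
                sum(a * b for a, b in zip(q[lo:hi], s[lo:hi])) / math.sqrt(head_dim)
                for s in sources
            ]
            weights = softmax(scores)
            # Weighted sum of source features for this head's slice.
            for j in range(lo, hi):
                out.append(sum(w * s[j] for w, s in zip(weights, sources)))
        # Residual connection (an assumption) keeps the node's own features.
        updated.append([qi + oi for qi, oi in zip(q, out)])
    return updated


def run_layers(word_nodes, relation_nodes, num_layers=2, num_heads=2):
    """Alternate the two directions of message passing in every layer."""
    for _ in range(num_layers):
        # Relation nodes read from the words, then words read from relations.
        relation_nodes = multi_head_attend(relation_nodes, word_nodes, num_heads)
        word_nodes = multi_head_attend(word_nodes, relation_nodes, num_heads)
    return word_nodes, relation_nodes


random.seed(0)
dim = 8
# Hypothetical stand-ins: 5 word vectors (BERT output in the paper) and
# 3 relation vectors (pre-training task output in the paper).
word_nodes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
relation_nodes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(3)]

words_out, relations_out = run_layers(word_nodes, relation_nodes)
print(len(words_out), len(relations_out), len(words_out[0]))
```

Treating words and relations as two node types means each direction of attention can be computed separately in every layer, which is the sense in which the graph is heterogeneous in this sketch.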
Keywords
Deep learning/Natural language processing/Knowledge graph/Entity relation extraction/Heterogeneous graph neural networks/Text semantic enhancement
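Funding
Science and Technology Project of China Southern Power Grid Co., Ltd. (037700KK52220042)
Science and Technology Project of China Southern Power Grid Co., Ltd. (GDKJXM20220906)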
Publication year
2024