Document-level Relation Extraction Integrating Evidence Sentence Extraction (融合证据句子提取的文档级关系抽取)

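Abstract (translated from the Chinese): Document-level relation extraction is a key task in natural language processing that aims to accurately extract the semantic relations between entity pairs from long documents. Traditional document-level relation extraction methods usually take the whole document as input, yet humans need only a subset of the sentences in a document, known as evidence sentences, to predict the relation of an entity pair. Many existing methods make use of evidence sentences, but they fail to find all of them and struggle to fully exploit their advantages. To address these problems, a more efficient and accurate evidence sentence selection method is introduced: an evidence extraction strategy that fuses a formula-based method with a sentence-deletion method, with evidence extraction integrated into the training and inference processes, so that the document-level relation extraction model attends more to important sentences while still capturing the complete information in the document. Experiments show that the improved model outperforms existing models on public datasets.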

English title: Document-level Relation Extraction Integrating Evidence Sentence Extraction

English abstract: As a crucial task in the field of natural language processing, document-level relation extraction aims to accurately extract semantic relationships between entities from lengthy documents. Traditional document-level relation extraction methods typically take the entire document as input. However, in reality, humans can predict relationships between entity pairs based on only a portion of the document, referred to as evidence sentences. In existing research, many methods utilize evidence sentences, but they face challenges such as incomplete evidence retrieval and difficulty in fully leveraging the advantages of these evidence sentences. To address this issue, we introduce a more efficient and accurate evidence sentence selection method. This is achieved by integrating a strategy for extracting evidence sentences through a fusion of formula-based and sentence-deletion-based approaches. We integrate the evidence extraction with the training and inference processes, directing the document-level relation extraction model to focus more on crucial sentences while still recognizing comprehensive information within the document. Experimental results demonstrate that the improved model outperforms existing models on public datasets.

English keywords:

Document-level; Relation extraction; Evidence sentences; Bilinear layer

Authors:

An Xiankua (安先跨), Xiao Rong (肖蓉), Yang Xiao (杨肖)

Affiliation:

School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, China

Keywords:

Document-level relation extraction; Evidence sentences; Bilinear layer

Funding:

Open Fund of the Hubei Key Laboratory of Big Data in Science and Technology (Wuhan Library, Chinese Academy of Sciences)

Project number:

E1KF291005

Publication year:

2024

DOI:

10.11896/jsjkx.230800081

Computer Science (计算机科学)

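Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)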

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(z1)

Number of references: 21