Joint extraction of entities and relations based on double-labeled architecture (基于双标注框架的实体关系联合抽取)
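Zeng Biqing (曾碧卿) 1, Cai Jian (蔡剑) 1, Li Yanlong (李砚龙) 1
Author information
- 1. School of Software, South China Normal University, Foshan 528225, Guangdong, China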
Abstract
Entity and relation extraction methods fall into two types: pipeline methods and joint extraction. Joint extraction models extract relations more effectively, whereas pipeline methods adapt more flexibly. To address the problem of relation overlap, a double-labeled entity and relation extraction framework is proposed. Joint decoding is used to extract subject entities from natural text, and object entities are then extracted in a pipeline manner. This design preserves the extraction accuracy of joint decoding while inheriting the flexibility of the pipeline method. The proposed framework was evaluated on the information extraction dataset DUIE and the distantly supervised dataset NYT. The results show that the model achieves competitive performance compared with the baseline models.
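The subject-first, object-second idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the start/end tag layout, the function names `decode_spans` and `extract_triples`, and the hard-coded tag sequences (which stand in for model outputs) are all assumptions made for illustration. The sketch shows only how tagging objects separately per subject and per relation lets one entity pair carry several relations, which is the relation overlap case.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]          # token span: inclusive start, exclusive end
Tags = Tuple[List[int], List[int]]  # (start markers, end markers), hypothetical layout


def decode_spans(start_tags: List[int], end_tags: List[int]) -> List[Span]:
    """Pair each start marker with the nearest end marker at or after it."""
    spans: List[Span] = []
    for i, is_start in enumerate(start_tags):
        if is_start != 1:
            continue
        for j in range(i, len(end_tags)):
            if end_tags[j] == 1:
                spans.append((i, j + 1))
                break
    return spans


def extract_triples(
    tokens: List[str],
    subject_tags: Tags,
    object_tags: Dict[Span, Dict[str, Tags]],
) -> List[Tuple[str, str, str]]:
    """Decode subjects first, then decode objects for each subject and relation."""
    triples: List[Tuple[str, str, str]] = []
    for subj in decode_spans(*subject_tags):
        subj_text = " ".join(tokens[subj[0]:subj[1]])
        # Second stage: one pair of tag sequences per relation, per subject.
        for relation, (starts, ends) in object_tags.get(subj, {}).items():
            for obj in decode_spans(starts, ends):
                obj_text = " ".join(tokens[obj[0]:obj[1]])
                triples.append((subj_text, relation, obj_text))
    return triples


# Hand-written tags standing in for model predictions (assumption).
tokens = ["Paris", "is", "the", "capital", "of", "France"]
subject_tags = ([1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0])
france = ([0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1])
object_tags = {(0, 1): {"capital_of": france, "located_in": france}}

triples = extract_triples(tokens, subject_tags, object_tags)
print(triples)
```

Because objects are tagged once per relation, the pair (Paris, France) appears under both hypothetical relations without any conflict in a single tag sequence.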
Key words
entity and relation extraction/sequence tagging/joint relation extraction/relation overlap/information extraction/attention mechanism/natural language processing (NLP)
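Funding
National Natural Science Foundation of China, General Program (62076103)
Guangdong Basic and Applied Basic Research Foundation (2021A1515011171)
Special Project in Key Areas of Artificial Intelligence for Regular Higher Education Institutions of Guangdong Province (2019KZDZX1033)
Guangzhou Basic Research Program, Basic and Applied Basic Research Project (202102080282)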
Publication year
2024