Survey on Domain-Limited Relation Extraction

侯景¹, 邓晓梅², 汉鹏武²

Author information

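1. Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094; University of Chinese Academy of Sciences, Beijing 100094
2. Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094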

Abstract

Domain-limited relation extraction aims to capture key information from text given predefined entity types and relation types, and it mostly represents that information as triples composed of a head entity, a tail entity, and the relation between them. As one of the important research directions in information extraction, it is widely used in tasks such as question answering and information retrieval. After introducing the relevant concepts and task paradigms, this paper analyzes research progress on domain-limited relation extraction in the deep learning era. Depending on whether the entities in a sentence are given, the task is divided into relation classification and triple extraction. According to the characteristics of the task, the former can be further divided into supervised relation classification, few-shot relation classification, and relation classification under distant supervision. This paper discusses and analyzes the technical methods commonly used in these tasks, along with their advantages and disadvantages. Finally, it summarizes the development potential and remaining challenges of relation extraction in settings closer to real-world scenarios, such as low-resource and multimodal settings.

Key words

Domain-limited relation extraction/Deep learning/Relation classification/Triples/Distant supervision

Publication year

2024

Computer Science (计算机科学)

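Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)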

CSTPCD / CSCD / Peking University Core Journals

Impact factor: 0.944

ISSN: 1002-137X

Number of references: 109
