Survey on Domain-Limited Relation Extraction
侯景 1, 邓晓梅 2, 汉鹏武 2
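Author information
- 1. Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094; University of Chinese Academy of Sciences, Beijing 100094
- 2. Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094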
Abstract
Domain-limited relation extraction captures key information from text given predefined entity types and relation types, and typically represents that information as triples consisting of a head entity, a tail entity, and the relation between them. As one of the important research directions in information extraction, it is widely used in tasks such as question answering and information retrieval. After introducing the relevant concepts and task paradigms, this paper analyzes research progress on domain-limited relation extraction in the deep learning setting. Depending on whether the entities in a sentence are given, the task is divided into relation classification and triplet extraction. According to the characteristics of the task, the former is further divided into supervised relation classification, few-shot relation classification, and relation classification under distant supervision. The paper discusses the technical methods commonly used in these tasks, along with their advantages and disadvantages. Finally, it summarizes the development potential and remaining challenges of relation extraction in low-resource, multimodal, and other settings closer to real-world scenarios.
Key words
Domain-limited relation extraction / Deep learning / Relation classification / Triples / Distant supervision
Publication year
2024