Research on entity relation extraction based on active learning

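Relation classification is an important NLP task for extracting relations between entities. This paper introduces a method for a large-scale Chinese information extraction dataset; the method incorporates BERT into a new framework and applies active learning to joint entity relation extraction. The model improves on existing methods in four respects. First, it handles the case in which multiple entities belong to multiple triplets: the framework is designed around the idea of a probabilistic graph, and a new "head-tail" labeling method is developed. Second, an innovative way of applying active learning to relation extraction is proposed. Third, to pass information among the subject, predicate, and object, a new normalization method called conditional layer normalization is proposed. Fourth, a new loss function is designed to avoid class imbalance. Experiments show that the model's information extraction ability is enhanced: a single model reaches an F1-score of 0.840 on the test set, and it achieves better performance with less data than the original deep model trained on the full data.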
Research on entity relation extraction based on active learning
Relation classification is an important NLP task for extracting relations between entities. In this paper, we report our method for a large-scale, schema-based Chinese information extraction dataset. We incorporate BERT into a new framework and apply active learning to joint entity relation extraction. The model extends existing approaches in four respects. First, it handles the case in which multiple entities belong to multiple triplets: we design the framework based on the idea of a probabilistic graph and develop a new "head-tail" labeling method. Second, we propose an innovative approach that applies active learning to the relation extraction problem. Third, to transmit information among the subject entity, predicate, and object entity, we propose a new normalization method called conditional layer normalization. Fourth, we design a new loss function to avoid class imbalance. As a result, the model's information extraction ability improves: a single model achieves an F1-score of 0.840 on the test set, and it performs better with much less data than the original deep models trained on the full data.
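The abstract does not spell out the "head-tail" labeling scheme, so the following is only a minimal sketch of one common reading: each entity span is marked by a 1 at its first token in a `head` sequence and a 1 at its last token in a `tail` sequence. The function names, the inclusive `(start, end)` span format, and the example sentence are assumptions made for illustration, not the paper's implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # hypothetical format: inclusive (start, end) token indices


def encode_head_tail(num_tokens: int, spans: List[Span]) -> Tuple[List[int], List[int]]:
    """Mark the first token of every span in `head` and the last token in `tail`."""
    head = [0] * num_tokens
    tail = [0] * num_tokens
    for start, end in spans:
        if not 0 <= start <= end < num_tokens:
            raise ValueError(f"span {(start, end)} is outside the sentence")
        head[start] = 1
        tail[end] = 1
    return head, tail


def decode_head_tail(head: List[int], tail: List[int]) -> List[Span]:
    """Pair each head position with the nearest tail at or after it."""
    spans: List[Span] = []
    for start, is_head in enumerate(head):
        if not is_head:
            continue
        for end in range(start, len(tail)):
            if tail[end]:
                spans.append((start, end))
                break
    return spans


# Illustrative sentence (not from the paper's dataset).
tokens = ["Alice", "Smith", "founded", "Acme", "Corp"]
head, tail = encode_head_tail(len(tokens), [(0, 1), (3, 4)])
entities = [" ".join(tokens[s:e + 1]) for s, e in decode_head_tail(head, tail)]
```

Because the two sequences are independent binary labels, several entities can be tagged in the same sentence, and a joint model could predict a separate pair of sequences for each predicate so that one entity can take part in several triplets. Nearest-tail decoding as written here cannot recover nested spans.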

BERT; active learning; joint entity relationship extraction

Sun Han (孙涵)

Network Service Center, Taiyuan Open University, Taiyuan 030027, China

BERT; active learning; joint entity relation extraction

Shanxi Province Education Science Planning Project (14th Five-Year Plan); Shanxi Modern Distance Education Society Project (2024)

GH-21105; SXYJ202403

2024

Modern Computer (现代计算机)
中大控股

Impact factor: 0.292
ISSN: 1007-1423
Year, volume (issue): 2024, 30(8)