CAAI Transactions on Intelligent Systems, 2024, Vol. 19, Issue 3: 697-706. DOI: 10.11992/tis.202205034

Sentence-level distant supervision relation extraction based on a self-adaptive loss function

Hu Feng (胡峰)¹, Yang Xinrui (杨新瑞)¹, Tang Chengfu (汤成富)¹, Deng Weibin (邓维斌)¹, Liu Qun (刘群)¹

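Author information

  • 1. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China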

Abstract

Distant supervision relation extraction is one approach to relation extraction. Existing methods mainly rely on multi-instance learning, performing relation extraction over bags of samples that share the same entity pair. However, bag-level methods can only alleviate the wrong-label problem; they cannot solve it completely. To address this, the paper first analyzes the distributions of clean data and noisy data and proposes a new self-adaptive loss function. On this basis, it proposes a sentence-level distant supervision relation extraction method built on that loss function. Experimental results on the public dataset NYT-10 and on a synthetic dataset based on TACRED show that the proposed method outperforms the methods in the compared studies: it distinguishes wrongly labeled noisy samples from clean samples more effectively and improves the accuracy of sentence-level distant supervision relation extraction.

Key words

natural language processing/information extraction/relation extraction/distant supervision/noise separation/noise label/negative training/self-adaptive loss function

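Funding

National Key Research and Development Program of China (2018YFC0832102)

Key Cooperation Project of the Chongqing Municipal Education Commission (HZ2021008)

Natural Science Foundation of Chongqing (cstc2021jcyjmsxmX0849)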

Publication year

2024
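CAAI Transactions on Intelligent Systems
Chinese Association for Artificial Intelligence; Harbin Engineering University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.672
ISSN: 1673-4785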