Real-time Name Disambiguation with Subgraph Enhancement (子图增强的实时同名消歧)

Real-time name disambiguation aims to associate newly added papers that carry an ambiguous author name with the correct author among the same-name candidates, accurately and in real time. Existing name disambiguation algorithms mainly address the cold-start setting, and few explore how to solve the real-time setting both efficiently and effectively. This paper proposes RND-all, a subgraph-enhanced real-time name disambiguation model that improves accuracy by efficiently fusing the structural features between the paper to be disambiguated and the candidate authors. The model constructs subgraphs from the attributes of the paper to be disambiguated and from the profiles of the same-name candidate authors, respectively, and then uses a subgraph structure feature extraction framework to compute graph-correlation features. Finally, it computes semantic matching features through feature engineering and text embedding, and applies ensemble learning to integrate the semantic information with the structural information. Experimental results show that incorporating structural information effectively improves the accuracy of real-time name disambiguation, and that RND-all ranks first on the test set of WhoIsWho, a million-scale name disambiguation benchmark.
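The pipeline described in the abstract (build subgraphs, compute structural features, compute semantic features, combine them) can be illustrated with a toy sketch. This is not the RND-all implementation: Jaccard overlap of attribute nodes stands in for the subgraph structure feature extraction framework, title-word overlap stands in for text embeddings, and a fixed weighted average stands in for the ensemble learner. All class names, function names, weights, and sample data below are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Paper:
    """A paper reduced to the few attributes used in this toy example."""
    title: str
    coauthors: set[str] = field(default_factory=set)
    venue: str = ""


def attribute_nodes(paper: Paper) -> set[str]:
    """Typed attribute nodes (coauthors, venue) of a one-paper subgraph."""
    nodes = {f"author:{name}" for name in paper.coauthors}
    if paper.venue:
        nodes.add(f"venue:{paper.venue}")
    return nodes


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def structural_score(paper: Paper, profile: list[Paper]) -> float:
    """Assumed stand-in for graph-correlation features: node-set overlap."""
    profile_nodes: set[str] = set()
    for p in profile:
        profile_nodes |= attribute_nodes(p)
    return jaccard(attribute_nodes(paper), profile_nodes)


def semantic_score(paper: Paper, profile: list[Paper]) -> float:
    """Assumed stand-in for text embeddings: overlap of title words."""
    profile_words: set[str] = set()
    for p in profile:
        profile_words |= set(p.title.lower().split())
    return jaccard(set(paper.title.lower().split()), profile_words)


def combined_score(paper: Paper, profile: list[Paper],
                   w_struct: float = 0.5) -> float:
    """Assumed stand-in for ensemble learning: a fixed weighted average."""
    return (w_struct * structural_score(paper, profile)
            + (1.0 - w_struct) * semantic_score(paper, profile))


def assign(paper: Paper, candidates: dict[str, list[Paper]]) -> str:
    """Return the id of the candidate author with the highest score."""
    return max(candidates, key=lambda cid: combined_score(paper, candidates[cid]))


# Hypothetical data: two candidate authors who share one name.
candidates = {
    "li_wei_1": [Paper("graph neural networks for citation analysis",
                       {"Zhang San", "Wang Wu"}, "KDD")],
    "li_wei_2": [Paper("protein folding with molecular dynamics",
                       {"Zhao Liu"}, "Bioinformatics")],
}
new_paper = Paper("graph embedding for author disambiguation", {"Zhang San"}, "KDD")
best_candidate = assign(new_paper, candidates)
```

In this sample the new paper shares a coauthor and a venue with the first candidate's profile and nothing with the second, so the structural score alone already separates them. A learned model would replace both hand-written scores and the fixed weight.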

real-time name disambiguation; graph neural network; structural information; ensemble learning

Han Tianyi (韩天翼), Cheng Xinyu (程欣宇), Zhang Fanjin (张帆进), Chen Bo (陈波)

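State Key Laboratory of Public Big Data, Guizhou University, Guiyang, Guizhou 550025

Engineering Research Center of Text Computing and Cognitive Intelligence, Ministry of Education, Guizhou University, Guiyang, Guizhou 550025

Department of Computer Science and Technology, Tsinghua University, Beijing 100084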

2024

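Journal of Chinese Information Processing (中文信息学报)
Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences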

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 0.8
ISSN: 1003-0077
Year, volume (issue): 2024, 38(1)
  • 29