Confidence-based internal long-tail noise detection in knowledge graphs

Bao Zhongjiang (鲍忠将)¹, Li Xuejun (李学俊)¹, Liao Jing (廖竞)¹

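Author information

1. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, Sichuan 621000, China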

Abstract

A confidence (trustworthiness) model is proposed to quantify how accurate a triple is. The model has three components: entity association strength, computed from the knowledge representation; long-tail features, computed during semantic recognition; and a confidence score derived from an assessment of the entity's surrounding structure. The goal is to use this confidence to detect long-tail noise in a knowledge graph. Experiments on the real-world dataset FB15K verify that the construction of the noisy dataset is reasonable, and the long-tail noise detection experiments show that the model outperforms the compared models. In the threshold experiments, its noise recognition accuracy remains above 90%. Overall, the results show significant and consistent improvements over other models.

Key words

knowledge graph/noise detection/trustworthiness/long-tail noise/entity association strength/knowledge graph internal noise/long-tail paths

Funding

National Defense Basic Scientific Research Program of China (JCKY2019204B007)

National Natural Science Foundation of China (6187230)

Publication year

2024

Computer Engineering and Design (计算机工程与设计)

Institute 706 of the Second Academy, China Aerospace Science and Industry Corporation

Indexed in: CSTPCD, Peking University Core Journals

Impact factor: 0.617

ISSN: 1000-7024

Number of references: 23