Deep Collaborative Truth Discovery Based on Variational Multi-hop Graph Attention Encoder

In the era of big data, unlocking the value of data often requires fusing multi-source data, and data conflict has become an unavoidable key problem in this process. To filter out true claims and reliable sources from conflicting data, researchers have proposed truth discovery methods. However, most existing truth discovery methods focus on the direct collaborative information between sources and claims and ignore the deeper indirect collaborative and antagonistic information, so they cannot adequately express the characteristics of sources and claims. To address this problem, this paper proposes a truth discovery method based on a variational multi-hop graph attention encoder (TD-VMGAE). It constructs a bipartite graph network from the inclusion relationship between sources and claims, uses multi-hop graph attention layers to aggregate indirect collaborative and antagonistic information into each node representation, and designs a truth discovery variational auto-encoder that extracts the required categorical distribution from the node representations in order to classify sources and claims collaboratively. Experimental results show that the proposed method performs well on three datasets of different scales, and ablation experiments and visualization further verify its effectiveness and generalization ability.
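
The graph construction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy observations, the `object=value` claim encoding, and the helper names `build_bipartite`, `two_hop_sources`, and `conflicting_claims` are all assumptions made for illustration. It only shows how a source-claim bipartite graph exposes indirect (two-hop) collaboration between sources and conflict between claims; the attention weighting and the variational auto-encoder are not modeled.

```python
from collections import defaultdict


def build_bipartite(observations):
    """Build both directions of the source-claim bipartite graph."""
    source_to_claims = defaultdict(set)
    claim_to_sources = defaultdict(set)
    for source, claim in observations:
        source_to_claims[source].add(claim)
        claim_to_sources[claim].add(source)
    return source_to_claims, claim_to_sources


def two_hop_sources(source, source_to_claims, claim_to_sources):
    """Sources two hops away: those that share at least one claim."""
    neighbours = set()
    for claim in source_to_claims[source]:
        neighbours |= claim_to_sources[claim]
    neighbours.discard(source)
    return neighbours


def conflicting_claims(claim, claim_to_sources):
    """Other claims about the same object (assumed 'object=value' encoding)."""
    obj = claim.split("=", 1)[0]
    return {
        other
        for other in claim_to_sources
        if other != claim and other.split("=", 1)[0] == obj
    }


# Hypothetical toy data: each pair says "this source provides this claim".
observations = [
    ("s1", "height=8848"),
    ("s2", "height=8848"),
    ("s3", "height=8844"),
    ("s3", "capital=Paris"),
    ("s1", "capital=Paris"),
]

s2c, c2s = build_bipartite(observations)
agreeing = two_hop_sources("s1", s2c, c2s)
conflicts = conflicting_claims("height=8848", c2s)
```

In this toy graph, `s1` and `s3` agree on one object and disagree on another. A method that looks only at direct source-claim edges would miss that kind of mixed signal, whereas multi-hop aggregation is meant to capture it.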

Data quality; Conflict resolution; Truth discovery; Multi-hop attention graph neural network; Variational auto-encoder

Zhang Guohao, Wang Yi, Zhou Xi, Wang Baoquan

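Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China

University of Chinese Academy of Sciences, Beijing 100049, China

Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China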

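Open Project of the Key Laboratory of Xinjiang Uygur Autonomous Region (2020D04050); Xinjiang Natural Science Foundation for Distinguished Young Scholars (2022D01E04); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01B67); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2021434)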

2024

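Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)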

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(3)