Research on rumor detection based on multilevel structure and semi-supervised learning

Zhang Yanke (张岩珂)¹, Dan Zhiping (但志平)¹, Dong Fangmin (董方敏)¹, Gao Zhun (高准)¹, Zhang Hongzhi (张洪志)¹

Author information

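1. Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, China; College of Computer and Information Technology, China Three Gorges University, Yichang 443002, China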
Abstract

Current rumor detection methods rely mainly on supervised learning, which requires manually labeled data and therefore delays detection. Social media generates a large amount of information, only a small portion of which can be labeled by professionals as true or false rumors. To make full use of the large volume of unlabeled data and detect false rumors on social networks in a timely manner, this paper proposes a rumor detection model based on multi-level structure and semi-supervised learning, the multi-level semi-supervised graph convolutional neural network (MSGCN). The model first builds a multi-level detection module that trains a graph convolutional network on the limited labeled samples to extract multi-level propagation structure features, diffusion structure features, and global structure features. Second, it introduces random model perturbation and ensembles the dynamic outputs on unlabeled data for consistency prediction, and it proposes a complementary pseudo-label method to obtain high-quality pseudo-labeled data, which are added to the labeled data to expand the training set. Finally, the model is trained under a supervised cross-entropy loss and an unsupervised consistency loss. Experiments on the public Twitter15, Twitter16, and Weibo datasets show that, with 30% of the samples labeled, the proposed model reaches accuracies of 88.3%, 90.1%, and 95.5%, respectively, so strong performance is achievable with only a small number of labeled samples.
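
The training objective described in the abstract (a supervised cross-entropy term plus an unsupervised consistency term, followed by pseudo-label selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean-squared-error consistency measure, the loss weight, the 0.9 confidence threshold, and all function names are assumptions made for illustration, and a plain confidence threshold stands in for the complementary pseudo-label method, whose details the abstract does not give.

```python
import math

def softmax(logits):
    """Turn raw scores into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Supervised loss for one labeled sample."""
    return -math.log(max(probs[label], 1e-12))

def consistency(p, q):
    """Assumed consistency measure: mean squared difference between the
    predictions of two randomly perturbed forward passes on one sample."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

def semi_supervised_loss(labeled, unlabeled_pairs, weight=1.0):
    """labeled: list of (probs, label); unlabeled_pairs: list of (probs_a, probs_b).
    `weight` is a hypothetical trade-off coefficient."""
    sup = sum(cross_entropy(p, y) for p, y in labeled) / len(labeled)
    unsup = sum(consistency(p, q) for p, q in unlabeled_pairs) / len(unlabeled_pairs)
    return sup + weight * unsup

def select_pseudo_labels(unlabeled_pairs, threshold=0.9):
    """Stand-in for pseudo-labeling: average the two perturbed predictions
    and keep samples whose top class probability reaches the threshold.
    Returns a list of (sample_index, predicted_class)."""
    selected = []
    for idx, (p, q) in enumerate(unlabeled_pairs):
        avg = [(a + b) / 2 for a, b in zip(p, q)]
        conf = max(avg)
        if conf >= threshold:
            selected.append((idx, avg.index(conf)))
    return selected

# Toy data: one labeled sample and two unlabeled samples, each unlabeled
# sample predicted twice under different (simulated) perturbations.
labeled = [(softmax([2.0, 0.0]), 0)]
unlabeled = [
    (softmax([3.0, 0.0]), softmax([3.2, -0.1])),  # confident and consistent
    (softmax([0.1, 0.0]), softmax([-0.1, 0.0])),  # uncertain
]

loss = semi_supervised_loss(labeled, unlabeled, weight=1.0)
pseudo = select_pseudo_labels(unlabeled, threshold=0.9)
print(loss, pseudo)
```

In this toy run only the first unlabeled sample passes the threshold, so it alone would be added to the labeled set for the next training round; the uncertain sample still contributes through the consistency term.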

Key words

rumor detection/semi-supervised/multilevel structure/pseudo label

Funding

NSFC-Xinjiang Joint Fund (U1703261)

Publication year

2024

Foreign Electronic Measurement Technology (国外电子测量技术)

Beijing Fanglue Information Technology Co., Ltd. (北京方略信息科技有限公司)

CSTPCD

Impact factor: 1.414

ISSN: 1002-8978

Number of references: 25