Multi-source Heterogeneous Data Progressive Fusion for Fake News Detection

Social media platforms are inundated with unverified information, much of it heterogeneous data from multiple sources. It spreads so widely and quickly that it poses a serious threat to individuals and society, so detecting and preventing fake news effectively is crucial. Current fake news detection models typically draw the textual and visual information of a news item from a single data source, which makes the reporting highly subjective and the data coverage incomplete. To address this problem, a fake news detection model based on the progressive fusion of multi-source heterogeneous data is proposed. First, multi-source heterogeneous data are collected, screened, and cleaned to build a multi-source multimodal dataset that contains multiple reports of each event from different perspectives. Next, the features obtained by the textual feature extractor and the visual feature extractor are fed into a multi-source fusion module, which progressively fuses features from different sources. In addition, sentiment features of the text and frequency-domain features of the images are introduced to enable multi-level feature extraction. Finally, a soft attention mechanism is used for feature integration. Experimental results and analysis show that the proposed model achieves better detection performance than existing popular methods, providing an effective solution for fake news detection in the era of big data.
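The abstract names soft attention as the final feature-integration step but does not say how it is parameterized. The sketch below is only an illustration under stated assumptions: each feature group (text, visual, sentiment, frequency domain) is already a vector of the same length, and relevance is scored with a simple dot product against a hypothetical learned vector called `scoring_vector`. The function names and toy values are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(
    features: Sequence[Sequence[float]],
    scoring_vector: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse same-sized feature vectors into one attention-weighted sum.

    `scoring_vector` stands in for a learned parameter (an assumption here).
    Returns the fused vector and the attention weights.
    """
    # One relevance score per feature vector (dot-product scoring, assumed).
    scores = [sum(w * x for w, x in zip(scoring_vector, f)) for f in features]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # Weighted sum of the feature vectors, computed dimension by dimension.
    dim = len(features[0])
    fused = [sum(a * f[d] for a, f in zip(weights, features)) for d in range(dim)]
    return fused, weights


# Toy vectors standing in for text, visual, sentiment and frequency features.
toy_features = [
    [0.2, 0.1, 0.4],
    [0.9, 0.3, 0.1],
    [0.0, 0.5, 0.5],
    [0.3, 0.3, 0.3],
]
fused, weights = soft_attention(toy_features, scoring_vector=[1.0, 0.0, -1.0])
```

Because the weights come from a softmax, every feature group contributes to the fused vector, and the group with the highest score contributes most. In a trained model the scoring parameters would be learned jointly with the classifier rather than fixed by hand as they are here.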

Keywords: Fake news detection; Data augmentation; Multi-source heterogeneous data; Feature fusion; Sentiment feature; Frequency domain feature

Yu Yongxin, Ji Ke, Gao Yuan, Chen Zhenxiang, Ma Kun, Zhao Xiaofan

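School of Information Science and Engineering, University of Jinan, Jinan 250022

Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan 250022

School of Information Network Security, People's Public Security University of China, Beijing 102623

Key Laboratory of Security Prevention Technology and Risk Assessment, Ministry of Public Security, Beijing 102623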

Funding: Shandong Provincial Key Research and Development Program (2021CXGC010103, 2018CXGC0706); Shandong Provincial Natural Science Foundation (ZR2022LZH016)

2024

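Computer Science
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)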

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(11)