Computer Science (计算机科学), 2024, Vol. 51, Issue 11: 30-38. DOI: 10.11896/jsjkx.240700004

多源异构数据渐进式融合的虚假新闻检测

Multi-source Heterogeneous Data Progressive Fusion for Fake News Detection

YU Yongxin (于泳欣) 1, JI Ke (纪科) 1, GAO Yuan (高源) 1, CHEN Zhenxiang (陈贞翔) 1, MA Kun (马坤) 1, ZHAO Xiaofan (赵晓凡) 2

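Author information

  • 1. School of Information Science and Engineering, University of Jinan, Jinan 250022; Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan 250022
  • 2. School of Information Network Security, People's Public Security University of China, Beijing 102623; Key Laboratory of Security Prevention Technology and Risk Assessment, Ministry of Public Security, Beijing 102623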

Abstract

Social media platforms are inundated with unverified information, much of it heterogeneous data from multiple sources, and it spreads so widely and quickly that it seriously harms individuals and society. Effective detection and prevention of fake news is therefore crucial. Current fake news detection models typically draw the textual and visual information of a news item from a single data source, which leads to highly subjective reporting and incomplete data coverage. To address this, a fake news detection model based on progressive fusion of multi-source heterogeneous data is proposed. First, multi-source heterogeneous data are collected, screened, and cleaned to build a multi-source multimodal dataset containing multiple reports on each event from different perspectives. Next, the features obtained by the textual and visual feature extractors are fed into a multi-source fusion module, which progressively fuses features from different sources. In addition, sentiment features of the text and frequency-domain features of the images are introduced to enable multi-level feature extraction. Finally, a soft attention mechanism is used for feature integration. Experimental results and analysis show that the proposed model achieves better detection performance than existing popular methods, providing an effective solution for fake news detection in the era of big data.
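The last two steps of the abstract (frequency-domain image features and soft-attention feature integration) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the transform, the feature dimensions, or the attention scoring function. The function names `frequency_features` and `soft_attention`, the use of a 2-D FFT, the block size `k`, and the linear scoring vector `w` are all assumptions chosen for illustration.

```python
import numpy as np

def frequency_features(image, k=8):
    """Hypothetical frequency-domain descriptor (assumption, not from the paper):
    log-magnitude of the k x k block of lowest non-negative 2-D FFT frequencies."""
    spectrum = np.fft.fft2(image)
    return np.log1p(np.abs(spectrum[:k, :k])).ravel()

def soft_attention(features, w):
    """Soft attention over n feature vectors.

    features: array of shape (n, d), one row per feature source.
    w: scoring vector of shape (d,) (assumed linear scoring).
    Returns (weights, fused): weights has shape (n,) and sums to 1,
    fused is the weighted sum of the rows, shape (d,).
    """
    scores = features @ w
    scores = scores - scores.max()          # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights, weights @ features

# Usage with random placeholder data (no real news data involved).
rng = np.random.default_rng(0)
image = rng.random((32, 32))                # stand-in for a grayscale image
freq = frequency_features(image)            # 64-dimensional vector
text = rng.random(64)                       # stand-in for a text feature
sentiment = rng.random(64)                  # stand-in for a sentiment feature
stacked = np.stack([text, sentiment, freq])
weights, fused = soft_attention(stacked, rng.random(64))
```

The softmax weights let the model emphasise whichever feature source scores highest for a given item while remaining differentiable, which is the usual motivation for soft rather than hard attention.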

Key words

Fake news detection/Data augmentation/Multi-source heterogeneous data/Feature fusion/Sentiment feature/Frequency domain feature


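Funding

Key Research and Development Program of Shandong Province (2021CXGC010103)

Key Research and Development Program of Shandong Province (2018CXGC0706)

Natural Science Foundation of Shandong Province (ZR2022LZH016)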

Publication year

2024
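Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)

Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)
Impact factor: 0.944
ISSN: 1002-137X
Number of references: 34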