Research on Multimodal Rumor Detection Based on GPT-4 Text Augmentation and Contrastive Learning

[Purpose/Significance] Data augmentation methods for multimodal rumor detection are currently limited in semantic accuracy and diversity. This study explores models and methods that improve the accuracy of multimodal rumor detection, which can help identify online rumors and strengthen network information governance. [Method/Process] A multimodal rumor detection model named TARD-GPT-4 is proposed. The model calls GPT-4 for data augmentation, uses BERT and ViT to extract textual and visual features, respectively, applies a supervised contrastive learning strategy to capture the label-attribute features of the data, and finally uses a fully connected layer to classify rumors. [Result/Conclusion] Prompting a large language model to augment data by rephrasing, combined with supervised contrastive learning, has a positive effect on the accuracy of multimodal rumor detection. Compared with the best of the selected baseline models, TARD-GPT-4 achieves 1.62% higher accuracy in multimodal rumor detection. A comparison of several data augmentation methods further shows that prompting the large language model to rephrase yields the best results.
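
The abstract names a supervised contrastive learning strategy but does not give its loss function. As a rough illustration only, the sketch below assumes the commonly used supervised contrastive (SupCon) formulation, in which samples sharing a label are pulled together and all other samples are pushed apart. The function names, the temperature value, and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Average SupCon-style loss over anchors that have at least one positive.

    Assumption: this follows the widely used supervised contrastive
    formulation, not necessarily the exact loss used in TARD-GPT-4.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled similarity of anchor i to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Positives are the other samples that share the anchor's label.
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue
        # Log-sum-exp over all non-anchor samples, shifted for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy fused features (hypothetical): two tight clusters in 2-D.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_labels = [1, 1, 0, 0]     # labels agree with the clusters
misaligned_labels = [1, 0, 1, 0]  # labels cut across the clusters

loss_aligned = supervised_contrastive_loss(toy_features, aligned_labels)
loss_misaligned = supervised_contrastive_loss(toy_features, misaligned_labels)
```

In this toy setup the loss is small when same-label samples already sit close together and large when they do not, which is the signal such a loss would give an encoder during training.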

Keywords: data augmentation; contrastive learning; multimodal; rumor detection

Jiang Chao (蒋超), Zhu Xuefang (朱学芳)

School of Information Management, Nanjing University, Nanjing 210023

Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing University, Nanjing 210023

2024

Library and Information Service
National Science Library, Chinese Academy of Sciences

Indexed in: CSTPCD, CSSCI, CHSSCD, Peking University Core Journals
Impact factor: 2.203
ISSN: 0252-3116
Year, volume (issue): 2024, 68(23)