Journal of Systems Science and Mathematical Sciences, 2024, Vol. 44, Issue 6: 1534-1549. DOI: 10.12341/jssmsKSS23868

Analysis of the Retweet Mechanism of Social Media—Based on Topic Filtering and Causal Inference

Huang Xiaohui 1, Yan Zhihua 2, Tang Xijin 1

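Author information

  • 1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049
  • 2. School of Management Science and Engineering, Shanxi University of Finance and Economics, Taiyuan 030006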

Abstract

Identifying the main factors that drive information reposting on social media platforms is important for containing the spread of harmful information. Previous research has mostly used regression analysis to find variables with a significant effect on retweet counts, an approach with limited interpretability. Combining statistical regression modeling with causal inference, this study analyzes the variables that affect retweets from the perspectives of user features and text features, and generates a dose-response function that explains the causal relationship between text sentiment and retweet counts. In addition, because observational social media datasets are subject to collection bias, the study filters the data with topic clustering. In an empirical analysis of Twitter datasets on the vaccine discussion and the presidential election, the study identifies a set of variables that significantly affect retweets, as well as the causal effect of text sentiment on retweet counts.

Keywords

Causal inference / topic filtering / Poisson regression / information diffusion
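
Poisson regression, listed among the keywords, models a count outcome such as the number of retweets through a log link. The following is a minimal sketch of that idea only, not the authors' model: the binary covariate `has_url`, the toy counts, and the function name `fit_poisson` are all hypothetical and were made up for illustration. With a single binary covariate the maximum-likelihood fit has a closed form (the log of each group's mean count), which makes the Newton iteration easy to check.

```python
import math

def fit_poisson(xs, ys, iters=50, tol=1e-10):
    """Fit log E[y] = b0 + b1 * x by Newton's method (maximum likelihood)."""
    # Start from the intercept-only fit so the first step is well-behaved.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score: gradient of the Poisson log-likelihood.
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mu))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(m * x for x, m in zip(xs, mu))
        h11 = sum(m * x * x for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * d = g by hand.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical toy data: has_url flag and retweet count per tweet.
has_url = [0, 0, 0, 1, 1, 1]
retweets = [1, 2, 3, 4, 6, 8]

b0, b1 = fit_poisson(has_url, retweets)
# exp(b1) is the multiplicative change in expected retweets when has_url = 1.
print(round(math.exp(b0), 3), round(math.exp(b1), 3))
```

In this toy setup the group means are 2 and 6, so the fitted rate ratio `exp(b1)` equals their ratio. A coefficient of this kind describes association only; the causal analysis the abstract describes is a separate step.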

Funding

National Natural Science Foundation of China (71971190)

Publication year

2024
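Journal of Systems Science and Mathematical Sciences
Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.425
ISSN: 1000-0577
Number of references: 55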