信息安全学报 (Journal of Cyber Security), 2024, Vol. 9, Issue 2: 19-35. DOI: 10.19363/J.cnki.cn10-1380/tn.2024.03.02

Knowledge Acquisition and Decision Making in Incomplete Rumor Information System Based on Rough Set

王标¹, 卫红权², 王凯², 刘树新², 江昊聪²

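Author information

  • 1. PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
  • 2. PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China; National Digital Switching System Engineering and Technological R&D Center, Zhengzhou 450002, China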

Abstract

Online rumors can disturb people's thinking, psychology, and behavior, trigger social unrest, and endanger public safety, and the widespread use of social platforms such as Weibo amplifies their impact and harm. Rumor detection is therefore of great significance to the orderly and healthy development of cyberspace. Current automatic rumor detection techniques focus mainly on the construction of detection models and the representation of input data, while little research addresses improving data quality to improve rumor detection. Motivated by this gap, this paper applies rough set theory to incomplete rumor information systems for knowledge acquisition and decision making. In essence, rough set theory is used to address the uncertainty measurement, redundancy, and incompleteness of incomplete rumor information systems, so as to obtain high-quality data and improve rumor detection. First, the paper systematically summarizes the uncertainty measures used in rough set theory, namely Shannon entropy, rough entropy, Liang entropy, and information granularity, and organizes and derives the consistent extension of these four measures from complete information systems to incomplete information systems. Based on these four measures, a knowledge reduction algorithm based on Maximum Correlation Minimum Redundancy (MCMR) is proposed. The method relies on entropy-based measures and jointly considers decision information and redundant noise. Experiments on eight datasets, including UCI and Weibo data, show that the proposed algorithm outperforms several baseline algorithms and effectively reduces the redundancy of the information system. In addition, the paper proposes an incomplete decision tree algorithm based on maximal consistent blocks. Experiments on data with different degrees of missingness show that this algorithm effectively handles the incompleteness of the information system.
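To make the terminology concrete, the following is a minimal Python sketch of the four uncertainty measures and of maximal consistent blocks on a toy incomplete table. It is not the authors' implementation. It assumes the tolerance-relation formulation commonly used in the rough set literature, where a missing value (written `*` here) matches any value, and it uses the tolerance-class forms of the measures usually given there; the paper's exact definitions are not reproduced on this page. All function names, the `MISSING` marker, and the toy attributes are hypothetical.

```python
from math import log2

MISSING = "*"  # assumed marker for an unknown attribute value


def tolerant(x, y, attrs):
    """True if x and y agree on every attribute, with MISSING acting as a wildcard."""
    return all(x[a] == y[a] or MISSING in (x[a], y[a]) for a in attrs)


def tolerance_classes(table, attrs):
    """For each object i, the set of object indices tolerant with it (including i)."""
    n = len(table)
    return [{j for j in range(n) if tolerant(table[i], table[j], attrs)}
            for i in range(n)]


def uncertainty_measures(table, attrs):
    """Tolerance-class forms of the four measures (assumed, literature-style definitions)."""
    n = len(table)
    sizes = [len(s) for s in tolerance_classes(table, attrs)]
    return {
        "shannon_entropy": -sum(log2(s / n) for s in sizes) / n,
        "rough_entropy": sum(log2(s) for s in sizes) / n,
        "liang_entropy": sum(1 - s / n for s in sizes) / n,
        "granularity": sum(sizes) / n ** 2,
    }


def maximal_consistent_blocks(table, attrs):
    """Maximal sets of pairwise tolerant objects, found as maximal cliques (Bron-Kerbosch)."""
    classes = tolerance_classes(table, attrs)
    neighbours = [s - {i} for i, s in enumerate(classes)]
    blocks = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            blocks.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & neighbours[v], excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(len(table))), set())
    return sorted(blocks, key=sorted)


# Toy incomplete table with two hypothetical attributes "a" and "b".
table = [
    {"a": 1, "b": 0},
    {"a": 1, "b": MISSING},
    {"a": MISSING, "b": 1},
    {"a": 0, "b": 1},
]
measures = uncertainty_measures(table, ["a", "b"])
blocks = maximal_consistent_blocks(table, ["a", "b"])
```

Under these assumed definitions, Shannon entropy and rough entropy sum to log2 of the number of objects, and Liang entropy and information granularity sum to 1, which gives a quick sanity check on any implementation. When no values are missing, the tolerance classes become equivalence classes and the measures reduce to their complete-system forms.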

Keywords

rumor detection / rough set / incomplete information system / maximum correlation minimum redundancy / maximal consistent blocks


Funding

Zhongyuan Talent Program (中原英才计划), Grant No. 212101510002

Publication year

2024
Indexed in: CSTPCD
Number of references: 64