基于代表性节点扩张的保持社区结构的图采样算法
Graph Sampling Algorithm Based on Representative Node Expansion to Maintain Community Structure
宏宇 1陈鸿昶 2张建朋 2黄瑞阳 2李邵梅2
作者信息
- 1. 郑州大学网络空间安全学院 郑州 450000
- 2. 中国人民解放军战略支援部队信息工程大学信息技术研究所 郑州 450000
- 折叠
摘要
作为一种能够简化大规模图并保留其指定属性的方法,图采样被广泛应用于现实生活中.然而当前研究大多集中于保留节点级的性质,如度分布等,而忽略了图的社区结构等更为重要的信息.针对此问题,提出了 一种保持社区结构的图采样算法.算法主要分为两个步骤,第一步为初始化社区代表点,根据提出的节点重要度计算公式算出节点的重要度,然后选出每个社区的代表性节点;第二步为社区结构扩张,针对每个社区,选择可能引入最少额外邻居的节点加入社区中,直到达到该社区节点上限.在多个真实数据集上进行了对比实验,使用多个评价指标来评估实验结果.实验结果表明,所提出的采样算法能够很好地保持原始图的社区结构,为大规模图的社区结构采样提供了可行的解决方案.
Abstract
Graph sampling is widely used in real life as a method to simplify large-scale graphs and retain specified properties.However,most of the current research focuses on preserving node-level properties,such as degree distribution,while ignoring more important information such as the community structure of graphs.To solve this problem,a graph sampling algorithm is pro-posed to maintain the community structure.The algorithm is divided into two steps.The first step is to initialize the community representative points,and the node importance is calculated according to the proposed node importance calculation formula,and then the representative nodes of each community are selected.The second step is to expand the community structure.For each community,it selects the node that can introduce the least additional neighbors to join the community until the upper limit of the community node is reached.Comparative experiments are conducted on a number of real data sets,and multiple evaluation indica-tors are adopted to evaluate the experimental results.Experimental results show that the proposed sampling algorithm can well maintain the overall community structure,and provides a feasible solution for sampling community structure of large-scale graphs.
关键词
图采样/社区结构/代表性节点/扩张/重要度Key words
Graph sampling/Community structure/Representative nodes/Expansion/Importance引用本文复制引用
基金项目
国家自然科学基金(62002384)
中国博士后科学基金面上项目(2020M683760)
嵩山实验室项目(221100210700-03)
出版年
2024