首页|基于距离泛化的二分图(α,β)-core高效分解算法

基于距离泛化的二分图(α,β)-core高效分解算法

扫码查看
(α,β)-core分解作为图数据管理与分析研究中的热点问题,已经被广泛应用于电商欺诈检测和兴趣群组推荐等实际场景中.然而现有(α,β)-core模型在构建时仅考虑顶点距离为1的邻居,难以刻画出二部图社区中的细粒度信息.针对此问题,提出了基于距离泛化的(α,β,h)-core模型,即由二部图中两个不相交的顶点集构成一个最大子图,满足一个集合中的任何一个顶点至少有α个与它的距离不大于h的邻居顶点,另一个集合中的任何一个顶点至少有β个与它的距离不大于h的邻居顶点.通过引入距离为h的邻居,解决了(α,β)-core模型细粒度刻画能力不足的问题.由于新模型需要考虑距离不大于h的邻居,因此(α,β,h)-core分解变得更为困难.为此,提出了基于计算共享的分解策略,据此设计了高效的(α,β,h)-core分解算法,并分析了算法性能.考虑到确定距离不大于h的邻居顶点非常耗时,还提出一种(α,β,h)-core下界以减少重复计算距离不大于h的邻居顶点,进一步提高计算效率.在8个真实图数据上的对比实验结果验证了新模型的有效性和算法的高效性.
Distance-generalized Based(α,β)-core Decomposition on Bipartite Graphs
(α,β)-core decomposition is a fundamental problem in graph analysis,and has been widely adopted for e-commerce fraud detection and interest group recommendation.Nevertheless,(α,β)-core model only considers the distance-1 neighborhood,which makes it unable to provide more fine-grained structure information.Motivated by this,(α,β,h)-core model is proposed in this paper,which requires the vertices in one/another part has at least α other vertices at distance not greater than h within the subgraph.Due to the distance-h neighborhoods being considered,the new model can identify more fine-grained structure informa-tion as verified in our experiments,which also makes the corresponding(α,β,h)-core decomposition challenging.To address this problem,an efficient algorithm based on computation-sharing strategy is proposed and time complexity is analyzed accordingly.As obtaining neighbors within distance h is time-consuming,a lower bound related to(α,β,h)-core is designed to avoid unnecessary distance-h neighbors computation to further improve the computational efficiency.Experimental results on eight real graphs de-monstrate the effectiveness of the proposed model and efficiency of its algorithm.

Bipartitegraphs(α,β,h)-core decompositionEfficient algorithm

张毅豪、华征宇、袁龙、张帆、王凯、陈紫

展开 >

南京理工大学计算机科学与工程学院 南京 210094

广州大学网络空间安全学院 广州 510555

上海交通大学安泰经济与管理学院 上海 200030

南京航空航天大学计算机科学与技术学院 南京 211106

展开 >

二部图 (α,β,h)-core分解 高效算法

国家重点研发计划信息系统工程重点实验室项目国家自然科学基金青年科学基金

2022YFF0712100WDZC20205250411NS-FC61902184

2024

计算机科学
重庆西南信息有限公司(原科技部西南信息中心)

计算机科学

CSTPCD北大核心
影响因子:0.944
ISSN:1002-137X
年,卷(期):2024.51(11)