MHGC: Multi-scale hard sample mining for contrastive deep graph clustering

Contrastive graph clustering is important for numerous real-world applications and yields encouraging performance. However, current methods often overlook hierarchical high-order semantic information and treat all contrastive pairs equally during optimization. Consequently, the abundance of easy sample pairs overwhelms the critical structural context learning process, limiting the accumulation of information and degrading the network's learning capability. To address this concern, a novel contrastive deep graph clustering method termed MHGC is proposed, which conducts hard sample mining in contrastive learning at multiple granularities. Specifically, random walk with restart is used to sample subgraphs centered on anchor nodes. Then, an attribute encoder that learns node representations is designed to obtain subgraph embeddings. Subsequently, hard and easy sample pairs within high-confidence clusters are identified by applying a two-component beta mixture model (BMM) to the clustering loss. Building upon this, a weight regulator is designed to adaptively tune the weights of sample pairs, and a multi-scale contrastive loss framework is proposed to leverage structural context information in a hierarchical contrastive manner. Comprehensive experiments on six widely used datasets confirm that MHGC performs competitively with state-of-the-art baselines, with an average increase of 1.54% in accuracy. Additionally, the ablation study indicates that the proposed multi-scale learning scheme and BMM-based hard mining strategy are both effective for the graph clustering task. The source code is available at https://github.com/sodarin/MHGC
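
The subgraph sampling step mentioned in the abstract can be illustrated with a minimal sketch of random walk with restart. This is not taken from the MHGC source code: the function name `rwr_subgraph`, the adjacency-dictionary input, and the parameters `size`, `restart_prob`, and `max_steps` are all assumptions made for illustration, and the authors' implementation may differ in how it restarts, caps the walk, or selects nodes.

```python
import random


def rwr_subgraph(adj, anchor, size, restart_prob=0.5, max_steps=1000, seed=None):
    """Collect up to `size` distinct nodes around `anchor` using a
    random walk with restart (hypothetical helper, for illustration only).

    adj: dict mapping each node to a list of its neighbors.
    """
    rng = random.Random(seed)
    visited = [anchor]  # discovery order; the anchor always comes first
    seen = {anchor}
    current = anchor
    for _ in range(max_steps):
        if len(visited) >= size:
            break
        neighbors = adj.get(current, [])
        # Jump back to the anchor with probability restart_prob,
        # or whenever the walk reaches a node without neighbors.
        if not neighbors or rng.random() < restart_prob:
            current = anchor
            continue
        current = rng.choice(neighbors)
        if current not in seen:
            seen.add(current)
            visited.append(current)
    return visited


if __name__ == "__main__":
    # Toy graph: a path 0-1-2-3 plus a separate edge 4-5.
    toy = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
    print(rwr_subgraph(toy, anchor=0, size=3, seed=0))
```

Because the walk keeps returning to the anchor, the sampled nodes stay concentrated in the anchor's neighborhood, which is what makes the result usable as a local subgraph view. A higher `restart_prob` in this sketch yields tighter, more local samples.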

Deep graph clustering; Graph contrastive learning; Multi-scale representation; Hard sample mining

Tao Ren, Haodong Zhang, Yifan Wang, Wei Ju, Chengwu Liu, Fanchun Meng, Siyu Yi, Xiao Luo


Software College, Northeastern University, Shenyang, 110170, China

School of Information Technology & Management, University of International Business and Economics, Beijing, 100029, China

School of Computer Science, Peking University, Beijing, 100871, China

School of Statistics and Data Science, Nankai University, Tianjin, 300071, China

Department of Computer Science, University of California, Los Angeles, 90095, USA


2025

Information Processing & Management


ISSN:0306-4573
Year, volume (issue): 2025, 62(4)
  • 70