Feature Interpolation Based Deep Graph Contrastive Clustering Algorithm

Mixup is an effective data augmentation technique in computer vision: it interpolates input images and their labels to synthesize new samples and thereby expand the training distribution. In graph node clustering, however, designing an effective interpolation method is challenging, both because the topology of graph data is irregular and connected and because the setting is unsupervised. To address this, we first design encoders with non-shared parameters to obtain the embedding features of each view of the graph, effectively fusing node features and structural information. We then interpolate the view embeddings together with their corresponding pseudo-labels, which introduces Mixup into the clustering task. To ensure the reliability of the pseudo-labels, a threshold is used to select only high-confidence pseudo-labels, and the model parameters are updated with an exponential moving average (EMA), so that optimization proceeds smoothly while taking the training history into account. In addition, a graph contrastive learning module enforces feature consistency across views, reducing information redundancy and improving the discriminative power of the model. Extensive experiments on six datasets demonstrate the effectiveness of the proposed method.
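The abstract does not give implementation details, so the following is only a minimal sketch of three steps it names: selecting high-confidence pseudo-labels with a threshold, interpolating embeddings together with their pseudo-labels, and an EMA parameter update. All function names, the plain-list representation, and the example values are illustrative assumptions rather than the authors' code; the graph encoders and the contrastive module are omitted.

```python
def select_confident(probs, threshold):
    """Return (node_index, pseudo_label) pairs whose largest
    soft-assignment probability is at least `threshold`."""
    selected = []
    for i, p in enumerate(probs):
        top = max(p)
        if top >= threshold:
            selected.append((i, p.index(top)))
    return selected


def one_hot(label, num_clusters):
    """Encode a pseudo-label as a one-hot vector so it can be interpolated."""
    return [1.0 if k == label else 0.0 for k in range(num_clusters)]


def mixup(z_a, y_a, z_b, y_b, lam):
    """Interpolate two embeddings and their pseudo-label vectors
    with the same mixing coefficient `lam` in [0, 1]."""
    z = [lam * a + (1.0 - lam) * b for a, b in zip(z_a, z_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return z, y


def ema_update(target, online, decay):
    """Exponential moving average of parameters:
    target <- decay * target + (1 - decay) * online."""
    return [decay * t + (1.0 - decay) * o for t, o in zip(target, online)]


if __name__ == "__main__":
    # Hypothetical embeddings and soft cluster assignments for three nodes.
    embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    probs = [[0.9, 0.1], [0.2, 0.8], [0.55, 0.45]]

    # Node 2 is dropped: its top probability (0.55) is below the threshold.
    confident = select_confident(probs, threshold=0.7)
    print(confident)  # [(0, 0), (1, 1)]

    (i, y_i), (j, y_j) = confident
    z_mix, y_mix = mixup(embeddings[i], one_hot(y_i, 2),
                         embeddings[j], one_hot(y_j, 2), lam=0.25)
    print(z_mix, y_mix)  # [0.25, 0.75] [0.25, 0.75]

    print(ema_update([1.0, 1.0], [0.0, 2.0], decay=0.5))  # [0.5, 1.5]
```

In the original image-domain Mixup the coefficient is drawn from a Beta distribution; a fixed `lam` is used here only to keep the example deterministic. Whether the paper samples it the same way is not stated in the abstract.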

Data augmentation; Graph contrastive clustering; EMA; Mixup; Graph neural network

Yang Xihong (杨希洪), Zheng Qun (郑群), Zhang Jiaxin (章佳欣), Wang Pei (王沛), Zhu En (祝恩)

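College of Computer Science, National University of Defense Technology, Changsha 410073

School of Earth and Space Sciences, University of Science and Technology of China, Hefei 230001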

National Science and Technology Major Project

2022ZD0209103

2024

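Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)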

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(11)