A cluster-weighted clustering ensemble algorithm based on member selection
徐森 1, 高婷 1, 徐秀芳 1, 许贺洋 1, 郭乃瑄 2, 卞学胜 1, 花小朋 1, 陈致远 1
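Author information
- 1. School of Information Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu 224002
- 2. School of Information Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu 224002; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189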
Abstract
Clustering ensemble algorithms are widely used in fields such as data mining and pattern recognition. Although existing clustering ensemble algorithms have made significant progress, few of them simultaneously address how to handle redundant members and account for the diversity within members. To this end, we design an uncertainty measure for clusters and propose a cluster-weighted clustering ensemble algorithm based on member selection. First, the average diversity is used to evaluate and select clustering members, and information entropy is introduced to measure the uncertainty of each cluster, which is then assigned a corresponding weight. Second, an enhanced matrix is constructed from the member-selection-based cluster-weighted co-association matrix and a high-confidence matrix. Finally, a hierarchical clustering algorithm is run on the enhanced matrix to obtain the final clustering ensemble result. Experiments on multiple UCI datasets compare the proposed algorithm with mainstream clustering ensemble algorithms; the results show that the proposed algorithm achieves better clustering ensemble performance and is highly robust and stable.
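The abstract does not give formulas, so the following is only a minimal sketch of the general idea of entropy-based cluster weighting followed by hierarchical clustering. It is not the authors' method. The exponential weight, the `theta` parameter, the function names, and the toy data are all assumptions made for illustration. The member-selection step and the high-confidence matrix are omitted, because the abstract does not specify them.

```python
import math
from collections import Counter


def cluster_entropy(members, m, cluster):
    """Summed entropy of how the other members split `cluster` of member m."""
    idx = [i for i, lab in enumerate(members[m]) if lab == cluster]
    total = 0.0
    for k, other in enumerate(members):
        if k == m:
            continue
        counts = Counter(other[i] for i in idx)
        for cnt in counts.values():
            p = cnt / len(idx)
            total -= p * math.log2(p)
    return total


def weighted_co_association(members, theta=0.5):
    """Co-association matrix in which each cluster votes with its own weight."""
    n_members, n_points = len(members), len(members[0])
    co = [[0.0] * n_points for _ in range(n_points)]
    for m, labels in enumerate(members):
        for cluster in set(labels):
            idx = [i for i, lab in enumerate(labels) if lab == cluster]
            # Assumed weighting (not from the paper): entropy 0 gives weight 1,
            # and the weight decays as the cluster becomes more uncertain.
            entropy = cluster_entropy(members, m, cluster)
            w = math.exp(-entropy / (theta * n_members))
            for i in idx:
                for j in idx:
                    co[i][j] += w / n_members
    return co


def average_linkage(sim, n_clusters):
    """Naive agglomerative clustering that merges the most similar groups."""
    groups = [[i] for i in range(len(sim))]

    def group_sim(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(groups) > n_clusters:
        pairs = [(group_sim(groups[a], groups[b]), a, b)
                 for a in range(len(groups))
                 for b in range(a + 1, len(groups))]
        _, a, b = max(pairs)
        groups[a] = groups[a] + groups[b]
        del groups[b]
    labels = [0] * len(sim)
    for g, group in enumerate(groups):
        for i in group:
            labels[i] = g
    return labels


# Toy ensemble: four base clusterings (members) of six points.
members = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],  # a noisier member that misplaces point 2
]
labels = average_linkage(weighted_co_association(members), n_clusters=2)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

In this toy run, the large cluster of the fourth member is split by the other three members, so it has high entropy and a low weight. The consensus therefore keeps point 2 with points 0 and 1.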
Key words
clustering ensemble/data mining/member selection/cluster weighting/information entropy/co-association matrix
Publication year
2024