Density Peaks Clustering Algorithm Based on Weighted Kernel Density Estimation and Micro-cluster Merging

Density peaks clustering (DPC) is a density-based clustering algorithm that is widely used because it is simple and efficient. However, DPC tends to split a single high-density cluster into several clusters and is highly prone to chain-reaction assignment errors. To address these problems, we propose a density peaks clustering algorithm based on weighted kernel density estimation and micro-cluster merging (WEMCM-DPC). The algorithm redefines local density using kernel density estimation and weighted K-nearest neighbors, which narrows the local density gap between high-density and sparse clusters and makes cluster center identification more accurate. It also introduces a new similarity measure between micro-clusters that reduces the influence of overly sparse or overly dense samples on other samples, provides a basis for merging micro-clusters, and mitigates the chain-reaction assignment errors of DPC, making the clustering results more accurate. Experiments on datasets with uneven density distributions and on real-world datasets show that WEMCM-DPC produces better clustering results than DPC and four improved algorithms.
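
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard DPC procedure that the abstract builds on: compute a local density for each point, compute the distance from each point to its nearest higher-density point, pick the points with the largest product of the two as cluster centers, and let every remaining point inherit the label of its nearest higher-density neighbor. That last step is where one wrong assignment can propagate to others. The density used here is a generic Gaussian kernel over the K nearest neighbors with a single global bandwidth. It is an assumption chosen for illustration and is not the weighted definition proposed in the paper, which this page does not give, and micro-cluster merging is not shown. The function names `knn_kernel_density` and `density_peaks`, the parameters `k` and `n_clusters`, and the synthetic data are all hypothetical.

```python
import numpy as np


def knn_kernel_density(dist, k):
    """Gaussian-kernel density over each point's k nearest neighbours.

    Illustrative stand-in only; NOT the paper's weighted definition.
    """
    # Column 0 of each sorted row is the point itself (distance 0): skip it.
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]
    # Global bandwidth: mean k-NN distance over all points (an assumption).
    bandwidth = knn_dist.mean()
    return np.exp(-(knn_dist / bandwidth) ** 2).sum(axis=1)


def density_peaks(X, k=5, n_clusters=2):
    """Baseline DPC: centers by rho * delta, labels by chained assignment."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    rho = knn_kernel_density(dist, k)

    # Indices ordered from densest to sparsest.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)           # distance to nearest higher-density point
    parent = np.full(n, -1)       # index of that higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point has no higher-density neighbour: use its
            # largest distance to any point, as in standard DPC.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Points with both high density and large delta become centers.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Visit points in decreasing density so each parent is labelled first;
    # an early wrong label is inherited by everything chained to it.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Synthetic example: one compact blob and one looser blob, far apart.
rng = np.random.default_rng(0)
dense_blob = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(40, 2))
sparse_blob = rng.normal(loc=(50.0, 0.0), scale=0.5, size=(40, 2))
X = np.vstack([dense_blob, sparse_blob])
labels, centers = density_peaks(X, k=5, n_clusters=2)
```

On well-separated data like this the baseline is expected to recover both groups. The failure modes named in the abstract arise when density differs strongly between clusters or when clusters touch, which is the situation the redefined density and the merging step are meant to handle.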

density peaks; clustering; kernel density estimation; K-nearest neighbor; micro-cluster merging

Li Zhigang, Lü Li, Tan Dekun, Kang Ping, Fan Tanghuai

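School of Information Engineering, Nanchang Institute of Technology, Nanchang, Jiangxi 330099, China

Nanchang Key Laboratory of IoT Perception and Collaborative Computing for Smart City, Nanchang Institute of Technology, Nanchang, Jiangxi 330099, China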

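National Natural Science Foundation of China (62066030)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ201915, GJJ220803)
Jiangxi Provincial Key Research and Development Program (20192BBE50076, 20203BBGL73225)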

2024

Information and Control
Chinese Association of Automation; Shenyang Institute of Automation, Chinese Academy of Sciences

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.576
ISSN: 1002-0411
Year, volume (issue): 2024, 53(3)