Efficient similarity measure for density peaks clustering

Density peaks clustering (DPC) has high computational complexity. To address this, an efficient similarity measure (ESM) method is proposed that builds an incomplete similarity matrix by measuring similarity only between nearest neighbors. Nearest neighbors are selected with the help of a randomly chosen third-party data object, so no additional parameter is introduced. Using the similarity matrix built by ESM, an improved efficient density peaks clustering (EDPC) algorithm is proposed that identifies cluster centers more efficiently than DPC while maintaining accuracy. Theoretical analysis and experimental results show that, by skipping similarity computations between objects that are certain to be dissimilar, ESM effectively improves the computational efficiency of DPC and of two improved variants, density peaks clustering based on K-nearest neighbors (DPC-KNN) and fuzzy weighted K-nearest neighbors density peaks clustering (FKNN-DPC), and that it scales well.
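The abstract does not spell out how the random third-party object is used, so the following is only a minimal sketch of one plausible reading, not the authors' ESM or EDPC implementation. It assumes Euclidean distance, the cut-off-kernel local density of the original DPC (the number of objects closer than a cut-off `dc`), and a pruning rule based on the triangle inequality: for a pivot p, d(x, y) >= |d(x, p) - d(y, p)|, so two objects whose distances to the pivot differ by more than `dc` are certain to be farther apart than `dc` and their similarity never needs to be computed. The names `pivot_filtered_pairs`, `local_density`, and `dc` are hypothetical and do not come from the paper.

```python
import numpy as np


def pivot_filtered_pairs(X, dc, rng=None):
    """Return index pairs that survive the pivot filter (assumed rule).

    A pair is kept only if the two objects' distances to a randomly
    chosen pivot differ by at most dc; all other pairs are provably
    farther apart than dc by the triangle inequality.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    pivot = X[rng.integers(n)]                    # random third-party object
    d_pivot = np.linalg.norm(X - pivot, axis=1)   # n distance evaluations
    order = np.argsort(d_pivot)                   # objects sorted by pivot distance
    sorted_d = d_pivot[order]
    pairs = []
    for a in range(n):
        # Objects after position a whose pivot distance is within dc of a's.
        hi = np.searchsorted(sorted_d, sorted_d[a] + dc, side="right")
        for b in range(a + 1, hi):
            pairs.append((order[a], order[b]))
    return pairs


def local_density(X, dc, rng=None):
    """DPC cut-off-kernel density, evaluated on surviving pairs only."""
    rho = np.zeros(X.shape[0], dtype=int)
    for i, j in pivot_filtered_pairs(X, dc, rng):
        if np.linalg.norm(X[i] - X[j]) < dc:
            rho[i] += 1
            rho[j] += 1
    return rho
```

Under these assumptions the filter only discards pairs that could not have contributed to the density, so the result matches a full pairwise computation while evaluating fewer distances. How much is saved depends on the data and on `dc`; the paper's own neighbor-selection rule may differ from this sketch.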

Keywords: density peaks clustering; cluster center; similarity matrix; computational complexity; large-scale dataset

Wang Lijuan (王丽娟), Xu Xiao (徐晓), Ding Shifei (丁世飞)

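School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China

School of Information Engineering, Xuzhou College of Industrial Technology, Xuzhou 221114, Jiangsu, China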

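Funding: National Natural Science Foundation of China (62206296); Fundamental Research Funds for the Central Universities (2022QN1095); Jiangsu Province Higher Vocational College Professional Leader High-End Training Program (2022GRFX063)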

2024

Journal of Shandong University (Engineering Science)
Shandong University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.634
ISSN: 1672-3961
Year, volume (issue): 2024, 54(3)