For data with uneven density distribution, the density peak clustering (DPC) algorithm disregards differences in sparsity between the samples of different clusters, which leads to inaccurate selection of cluster centers. Moreover, its allocation strategy tends to assign samples in sparse areas to dense areas by mistake, degrading the clustering result. This paper therefore proposes a density peak clustering algorithm based on the weighted reverse nearest neighbor (DPC-WR) for datasets with uneven density distribution. First, a weight coefficient based on the sigmoid function is introduced into the local density formula to increase the weight of samples in sparse areas. Combined with the concept of the reverse nearest neighbor, the local density of samples is then redefined, which improves the recognition rate of cluster centers. Second, an improved sample similarity strategy is introduced, which uses reverse nearest neighbors and the shared reverse nearest neighbor information between samples to increase the similarity of samples in the same cluster. This effectively addresses the misallocation of samples in sparse areas. Experiments on datasets with uneven density distribution, datasets with complex morphology, and UCI datasets show that DPC-WR achieves better clustering results than the IDPC-FA, FNDPC, FKNN-DPC, DPC, and DPCSA algorithms.
Key words
density peak clustering/uneven density distribution/reverse nearest neighbor/shared reverse nearest neighbor/sample similarity/local density/allocation strategy/data mining
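
As an illustration of the two neighborhood concepts named in the abstract, the following minimal sketch computes reverse nearest neighbors and shared reverse nearest neighbors on a toy dataset. It uses the standard definitions only: sample j is a reverse nearest neighbor of sample i when i is among the k nearest neighbors of j. The function names, the toy points, and the choice k = 2 are assumptions made for illustration; the sigmoid weight coefficient, the local density formula, and the similarity measure of DPC-WR are not given in the abstract and are not reproduced here.

```python
import math


def k_nearest_neighbors(points, k):
    """For each point index, return the set of indices of its k nearest
    neighbors under Euclidean distance (ties broken by index)."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def reverse_nearest_neighbors(knn):
    """RNN(i) = { j : i is among the k nearest neighbors of j }."""
    rnn = [set() for _ in knn]
    for j, nbrs in enumerate(knn):
        for i in nbrs:
            rnn[i].add(j)
    return rnn


def shared_reverse_neighbors(rnn, a, b):
    """Samples that are reverse nearest neighbors of both a and b."""
    return rnn[a] & rnn[b]


# Hypothetical toy data: a dense group (indices 0-3) and a sparse group (4-6).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (6.5, 5.0), (5.0, 6.5)]

knn = k_nearest_neighbors(points, k=2)
rnn = reverse_nearest_neighbors(knn)

print(shared_reverse_neighbors(rnn, 1, 2))  # two samples in the dense group
print(shared_reverse_neighbors(rnn, 5, 6))  # two samples in the sparse group
print(shared_reverse_neighbors(rnn, 0, 4))  # samples from different groups
```

In this toy example, samples within the same group share reverse nearest neighbors whether the group is dense or sparse, while samples from different groups share none. This is the kind of neighborhood information that the abstract describes as being used to raise the similarity of samples in the same cluster.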