Density peaks clustering algorithm with nearest neighbor optimization for data with uneven density distribution
Data with uneven density distribution are data in which sample sparsity varies between class clusters. On such datasets, the density peaks clustering (DPC) algorithm tends to place cluster centers in the higher-density regions and to assign samples from sparse clusters to dense clusters. To avoid these defects, this paper proposes a density peaks clustering algorithm with nearest neighbor optimization (DPC-NNO) for data with uneven density distribution. DPC-NNO combines reverse nearest neighbors and k-nearest neighbors to define a new local density that raises the local density of sparse samples, allowing the algorithm to find cluster centers more accurately. It also introduces shared nearest neighbors into the assignment strategy: the similarity between samples is computed and collected into a similarity matrix, so that samples of the same cluster are more closely related and wrong assignments are avoided. We compare DPC-NNO with the IDPC-FA, DPCSA, FNDPC, FKNN-DPC, and DPC algorithms. Experimental results show that DPC-NNO achieves excellent clustering results on uneven density datasets and that its overall performance is better than that of the comparison algorithms on complex datasets and UCI datasets.
density peaks; clustering analysis; uneven density distribution; reverse nearest neighbor; shared nearest neighbor; similarity of samples
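The abstract names three neighborhood quantities (k-nearest neighbors, reverse nearest neighbors, and shared nearest neighbors) without giving their formulas. The following is a minimal sketch of how these building blocks are commonly computed, not the paper's definition of local density or similarity. The function names `knn_indices`, `reverse_nn_counts`, and `snn_counts` are hypothetical, and the use of Euclidean distance and of a raw shared-neighbor count as the similarity are assumptions made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    """Return an (n, k) array with each sample's k nearest neighbors (self excluded).

    Assumption: Euclidean distance; the abstract does not state the metric.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]

def reverse_nn_counts(knn):
    """For each sample, count how many other samples list it among their k nearest neighbors."""
    n = knn.shape[0]
    return np.bincount(knn.ravel(), minlength=n)

def snn_counts(knn):
    """Return an (n, n) matrix whose (i, j) entry is the number of neighbors shared by i and j.

    Assumption: the raw count stands in for a similarity; the paper's actual
    similarity measure may weight or normalize it differently.
    """
    n = knn.shape[0]
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], knn] = 1  # member[i, j] = 1 if j is in kNN(i)
    return member @ member.T

# Toy data: three close points and one isolated point on a line.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
knn = knn_indices(X, k=2)
rnn = reverse_nn_counts(knn)
snn = snn_counts(knn)
```

In this toy example the isolated point is nobody's nearest neighbor, so its reverse-neighbor count is zero. That illustrates why reverse neighbors carry density information that a plain k-nearest-neighbor distance can miss.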