GBDEN: A Fast Clustering Algorithm for Large-scale Data Based on Granular Ball
Clustering partitions the objects in a dataset into groups, or clusters, based on feature similarity, so that objects within a group are more similar to each other than to objects in other groups. Density-based clustering is a family of unsupervised methods that does not require the number of clusters to be specified in advance; instead, it determines the clusters adaptively from the density of the data. Compared with methods such as K-means, density-based clustering is less sensitive to the selection of initial points and can produce more robust and reliable results. Among density-based clustering algorithms, DENCLUE (DENsity-based CLUstEring) uses a hill-climbing approach that is grounded in a solid mathematical foundation. It also performs well on datasets with considerable noise and can find arbitrarily shaped clusters in high-dimensional datasets. However, processing large-scale datasets with DENCLUE requires significant computational resources and time. To address this challenge, this paper proposes GBDEN, a fast clustering algorithm for large-scale data based on granular balls. The method first creates a coarse-grained granular ball, which is then refined into fine-grained granular balls. These granular balls serve as the input to the DENCLUE algorithm for clustering. Experimental results demonstrate the effectiveness of this approach on multiple datasets.
Clustering; Granular computing; Granular ball; DENCLUE; Kernel function
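The pipeline summarized in the abstract (one coarse ball, refinement into fine balls, density-based clustering on the balls) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 2-means splitting rule, the `max_points` stopping criterion, the Gaussian kernel bandwidth `h`, the fixed-point (mean-shift style) update used as the hill-climbing step, and the rule that merges attractors closer than `h`. All function names are hypothetical, and DENCLUE's noise threshold is omitted.

```python
import math
import random

def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def split_ball(points, rng, n_iter=10):
    """Split one ball into two with a few 2-means iterations (assumed rule)."""
    centers = rng.sample(points, 2)
    for _ in range(n_iter):
        groups = ([], [])
        for p in points:
            k = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            groups[k].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. duplicated points
        centers = [mean(g) for g in groups]
    return [g for g in groups if g]

def granular_balls(points, max_points=20, seed=0):
    """Refine one coarse ball into fine balls; return (center, size) pairs."""
    rng = random.Random(seed)
    stack, balls = [list(points)], []  # start from a single coarse ball
    while stack:
        ball = stack.pop()
        children = split_ball(ball, rng) if len(ball) > max_points else [ball]
        if len(children) < 2:
            balls.append((mean(ball), len(ball)))  # fine enough, or unsplittable
        else:
            stack.extend(children)
    return balls

def hill_climb(x, balls, h, n_iter=100, tol=1e-5):
    """Climb the size-weighted Gaussian kernel density from x to a local mode."""
    for _ in range(n_iter):
        w = [n * math.exp(-math.dist(x, c) ** 2 / (2 * h * h)) for c, n in balls]
        total = sum(w)
        new_x = tuple(
            sum(wi * c[d] for wi, (c, _) in zip(w, balls)) / total
            for d in range(len(x))
        )
        if math.dist(new_x, x) < tol:
            return new_x
        x = new_x
    return x

def cluster_balls(balls, h):
    """Label each ball by its density attractor; attractors within h are merged."""
    modes, labels = [], []
    for center, _ in balls:
        a = hill_climb(center, balls, h)
        for j, m in enumerate(modes):
            if math.dist(a, m) < h:
                labels.append(j)
                break
        else:
            modes.append(a)
            labels.append(len(modes) - 1)
    return labels
```

In this sketch the hill climbing runs once per ball rather than once per point, and each density evaluation sums over balls rather than points, which is where a ball-based input would be expected to save time on large datasets. Every point inherits the label of the ball that contains it.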