Data are often distributed on irregular manifolds, where the underlying clusters exhibit non-convex shapes and structures; clustering problems for such data are collectively referred to as non-convex clustering. However, existing mainstream non-convex clustering methods, including clustering in the original space and clustering based on space transformation, do not describe non-convex data patterns explicitly and therefore cannot explain the underlying mechanisms that produce such structures. To address this, a descriptive clustering model for non-convex clustering is proposed. Firstly, a feature-weighted kernel density model with a hybrid form is defined based on the kernel density method, which neither assumes a probability distribution model in advance nor restricts the shape of clusters; traditional model-based clustering methods offer neither property. Secondly, a clustering objective function is derived from the proposed model, and an optimization algorithm based on the expectation maximization algorithm is proposed for finding the local maxima of the density function. Samples that ascend to the same local maximum of the density function are assigned to the same cluster. Finally, a model-based non-convex clustering algorithm is defined. The algorithm does not require the number of clusters to be specified manually, and it assigns an explicit probability density function to each cluster, which helps to characterize clusters more robustly and accurately. In addition, the algorithm performs adaptive bandwidth selection and learns feature weights for the sample space, enabling automatic embedded feature selection during the clustering process.
Key words
non-convex clustering/descriptive model/model-based clustering/feature selection/kernel density estimation/local density maximum
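
The mode-seeking idea summarized in the abstract (each sample climbs to a local maximum of a kernel density, and samples that reach the same maximum form one cluster) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a Gaussian kernel, and it takes the feature weights `weights` and the bandwidth `bandwidth` as fixed inputs, whereas the proposed method learns both. The function names `climb_to_mode` and `cluster_by_modes` and the tolerance `merge_tol` are hypothetical and introduced only for illustration.

```python
import math


def weighted_sq_dist(x, y, weights):
    """Feature-weighted squared Euclidean distance between two points."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def climb_to_mode(start, data, weights, bandwidth, tol=1e-6, max_iter=500):
    """Fixed-point ascent on a feature-weighted Gaussian kernel density.

    Assumption: Gaussian kernel with fixed weights and bandwidth.
    Each iteration has an E-like step (kernel responsibilities) and an
    M-like step (responsibility-weighted mean of the samples).
    """
    point = list(start)
    for _ in range(max_iter):
        # E-like step: how strongly each sample attracts the current point.
        resp = [
            math.exp(-weighted_sq_dist(point, s, weights) / (2.0 * bandwidth ** 2))
            for s in data
        ]
        total = sum(resp)
        # M-like step: move to the responsibility-weighted mean.
        new_point = [
            sum(r * s[d] for r, s in zip(resp, data)) / total
            for d in range(len(point))
        ]
        shift = max(abs(a - b) for a, b in zip(new_point, point))
        point = new_point
        if shift < tol:
            break
    return point


def cluster_by_modes(data, weights, bandwidth, merge_tol=1e-2):
    """Group samples by the density maximum they converge to.

    The number of clusters is not given in advance; it is the number of
    distinct modes found (modes closer than merge_tol are merged).
    """
    modes, labels = [], []
    for sample in data:
        mode = climb_to_mode(sample, data, weights, bandwidth)
        for k, known in enumerate(modes):
            if max(abs(a - b) for a, b in zip(mode, known)) < merge_tol:
                labels.append(k)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return labels, modes


# Toy data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, modes = cluster_by_modes(data, weights=[1.0, 1.0], bandwidth=0.5)
print(labels, len(modes))
```

Because clusters are defined by where the ascent ends rather than by distance to a centroid, this kind of procedure places no convexity constraint on cluster shape; the choice of bandwidth and feature weights, fixed here, is what the abstract's adaptive selection is meant to automate.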