Object-based cluster validation with densities

Clustering validity indices are typically used to find the correct number of clusters in a data set and/or to evaluate the quality of the clusters formed by clustering algorithms. They measure the separation and compactness of clusters. Typically, the input to a clustering algorithm includes the number of clusters. After applying the algorithm with several different numbers of clusters, we take the number of clusters to be the one with the best validity index. There are two types of clustering validity indices: external indices, which are supervised, and internal indices, which are unsupervised. The focus of this paper is on internal validity indices. Some existing internal validity indices capture the properties of the clusters through representative statistics such as mean, variance, and diameter; however, these do not perform well when clusters have arbitrary shapes. One approach to overcoming this issue is to use the density of the data objects in each cluster. This has the advantage of capturing the full characteristics of the cluster, which is most beneficial when clusters have arbitrary shapes. A few density-based clustering validity indices have been proposed in the literature. However, some of them perform poorly when the clusters are not perfectly separated, and others perform poorly because they use only representative objects from each cluster instead of all objects. The contribution of this paper is an internal validity index named the object-based clustering validity index with densities (OCVD). OCVD is a single number that averages the density-based contributions of individual data objects to both the separation and the compactness of clusters. The density-based contributions of the objects are calculated with kernel density estimation. We show through several experiments that OCVD performs well in detecting the correct number of clusters in data sets with different cluster shapes, including arbitrary shapes. © 2021 Elsevier Ltd. All rights reserved.
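The abstract names kernel density estimation as the basis for each object's contribution but does not give the OCVD formula. The sketch below is therefore not OCVD itself. It only illustrates the underlying idea with a Gaussian kernel: an object is estimated to lie in a denser region of its own cluster than of a well-separated cluster. The function name `gaussian_kde`, the bandwidth `h`, and the toy clusters are all assumptions made for illustration.

```python
import math


def gaussian_kde(x, points, h=1.0):
    """Gaussian kernel density estimate at x, built from `points`.

    x and each element of points are tuples of equal length d.
    h is the bandwidth (an illustrative choice, not taken from the paper).
    """
    d = len(x)
    # Normalising constant of a d-dimensional Gaussian kernel with bandwidth h.
    norm = (2.0 * math.pi) ** (d / 2.0) * h ** d
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(points) * norm)


# Two hypothetical, well-separated toy clusters in 2-D.
cluster_a = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (0.1, -0.2)]
cluster_b = [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.3)]

x = cluster_a[0]
# Density of x under its own cluster (x itself is included in the estimate).
own_density = gaussian_kde(x, cluster_a, h=0.5)
# Density of x under the other cluster.
other_density = gaussian_kde(x, cluster_b, h=0.5)

print(own_density > other_density)
```

A density-based index could compare quantities like `own_density` and `other_density` for every object and average the result; how OCVD actually combines them is defined in the paper, not here.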

Keywords: Clustering; Clustering validity index; Internal index; Density-based cluster validation; Unsupervised; IMAGE SEGMENTATION; CROSS-VALIDATION; VALIDITY MEASURE; KERNEL; CHOICE; INDEX

Tavakkol, Behnam; Choi, Jeongsub; Jeong, Myong Kee; Albin, L. Susan

Stockton Univ

West Virginia Univ

Rutgers State Univ

2022

Pattern Recognition

EI; SCI
ISSN:0031-3203
Year, volume (issue): 2022, 121