Research and Simulation Analysis on the Optimization of BIRCH Data Clustering Algorithm
In recent years, identifying clusters, or dense regions, in multidimensional datasets has been one of the most widely studied problems in data analysis. To handle large datasets while minimizing I/O costs, a hierarchical data clustering method called Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) has been proposed. In this article, the performance of the BIRCH clustering algorithm is evaluated in terms of time and space efficiency, the Calinski-Harabasz index under varying algorithm parameters, and clustering quality. A performance comparison with the classic CLARANS algorithm is also conducted.
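For readers unfamiliar with the Calinski-Harabasz index used as an evaluation criterion here, the following is a minimal sketch of its standard definition: the ratio of between-cluster dispersion to within-cluster dispersion, each normalized by its degrees of freedom. The function name `calinski_harabasz` and the toy data are illustrative assumptions and are not taken from the article's experiments. Returning infinity when the within-cluster dispersion is zero is likewise a convention chosen for this sketch.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for a labelled set of points.

    CH = (B / (k - 1)) / (W / (n - k)), where B is the between-cluster
    dispersion, W the within-cluster dispersion, k the number of clusters
    and n the number of points. Higher values indicate denser, better
    separated clusters.
    """
    n = len(points)
    dim = len(points[0])

    # Group the points by cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0.0:
        # Assumption of this sketch: perfectly tight clusters score infinity.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Toy data (hypothetical): two tight groups far apart along the first axis.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = calinski_harabasz(toy_points, [0, 0, 1, 1])  # labels follow the groups
poor = calinski_harabasz(toy_points, [0, 1, 0, 1])  # labels cut across them
print(good, poor)
```

On the toy data, the labelling that follows the two groups scores far higher than the one that splits them, which is the behaviour that makes the index useful for comparing clusterings obtained under different algorithm parameters.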