Incremental Parallel Clustering Method for Massive Network Data Based on the DBSCAN Algorithm

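Abstract: When data grow dynamically, traditional clustering algorithms must rerun the entire clustering process, which is time-consuming and inefficient. To address this challenge, an incremental parallel clustering method for massive network data based on the DBSCAN algorithm is proposed. The network data are partitioned according to the Chernoff bounds criterion so that the partitions are balanced and representative. The DBSCAN algorithm is then applied to identify high-density regions accurately while handling noisy data, which yields the initial clustering of the network data. For dynamic data, an incremental merging rule is defined that efficiently merges new data with the original clusters and keeps the clustering results up to date in real time. Experimental results show that the proposed method achieves a high confidence level (no lower than 97%) and performs well in terms of clustering time complexity, achieving accurate and fast incremental parallel clustering of massive network data.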

English title: Incremental parallelization clustering method of massive network data based on DBSCAN algorithm

English abstract: Traditional clustering algorithms must rerun the entire clustering process when facing dynamically increasing data, which is time-consuming and inefficient. To effectively address this challenge, an incremental parallelization clustering method for massive network data based on the DBSCAN algorithm is proposed. The Chernoff bounds criterion is used to partition the network data, ensuring balance and representativeness. The DBSCAN algorithm is applied for clustering, accurately identifying high-density areas while handling noisy data, to achieve the initial clustering of the network data. For dynamic data, an incremental merging principle is set to efficiently merge new data with the original clusters and keep the clustering results updated in real time. The experimental results show that the proposed method has a high confidence level (not less than 97%) and performs well in clustering time complexity, successfully achieving incremental, parallel, precise, and fast clustering of massive network data.
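The initial-clustering and incremental-merging steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the merging rule, so the rule in `add_point` is an assumption chosen for illustration, and the names `dbscan`, `add_point`, `eps`, and `min_pts` are hypothetical. The Chernoff-bounds partitioning and the parallel execution are left out. The sketch also never re-checks whether existing points become core points after an insertion, which a complete incremental DBSCAN would have to do.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN. Returns (labels, core_flags)."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None or not core[i]:
            continue
        # Grow a new cluster from an unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for j in neighbors[p]:
                if labels[j] is None:
                    labels[j] = cluster
                    if core[j]:  # only core points keep expanding
                        stack.append(j)
        cluster += 1
    # Anything still unlabeled is noise.
    return [NOISE if lab is None else lab for lab in labels], core


def add_point(points, labels, core, new_point, eps, min_pts):
    """Insert one point without re-clustering (assumed, simplified rule).

    - Non-core new point: joins the cluster of a nearby core point, else noise.
    - Core new point: merges the clusters of all nearby core points
      (or opens a new cluster) and absorbs nearby noise points.
    Existing core flags are NOT re-evaluated (simplification).
    """
    near = [j for j, q in enumerate(points) if dist(q, new_point) <= eps]
    is_core = len(near) + 1 >= min_pts
    hit = sorted({labels[j] for j in near if core[j]})
    points.append(new_point)
    core.append(is_core)
    if not is_core:
        labels.append(hit[0] if hit else NOISE)
        return labels[-1]
    target = hit[0] if hit else max(labels, default=NOISE) + 1
    for j, lab in enumerate(labels):
        if lab in hit:  # merge touched clusters into one label
            labels[j] = target
    for j in near:
        if labels[j] == NOISE:  # nearby noise becomes part of the cluster
            labels[j] = target
    labels.append(target)
    return target


# Two dense groups on a line plus one isolated point.
pts = [(0, 0), (1, 0), (2, 0), (4, 0), (5, 0), (6, 0), (20, 0)]
labels, core = dbscan(pts, eps=1.1, min_pts=2)
# A new point at x=3 bridges the two groups, so their clusters merge.
bridged = add_point(pts, labels, core, (3, 0), eps=1.1, min_pts=2)
```

In this toy run the initial clustering finds two clusters and one noise point, and inserting the bridging point relabels both groups as a single cluster without rerunning `dbscan` on the full data set.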

English keywords:

DBSCAN algorithm; network data; data increment; parallel clustering; Chernoff bounds criterion; incremental merging rules

Authors:

郑艳松 (Zheng Yansong), 陶礼贵 (Tao Ligui)

Author affiliation:

School of Artificial Intelligence, Zhujiang College, South China Agricultural University, Guangzhou 510980

Keywords:

DBSCAN algorithm; network data; data increment; parallel clustering; Chernoff bounds criterion; incremental merging rules

Year of publication:

2024

DOI: