
Incremental Parallel Clustering Method for Massive Network Data Based on the DBSCAN Algorithm

Traditional clustering algorithms must rerun the entire clustering process whenever the data grows dynamically, which is time-consuming and inefficient. To address this challenge, an incremental parallel clustering method for massive network data based on the DBSCAN algorithm is proposed. The network data are partitioned using the Chernoff bounds criterion so that the partitions are balanced and representative. The DBSCAN algorithm is then applied to identify high-density regions accurately while handling noisy data, which yields the initial clustering of the network data. For dynamic data, an incremental merging principle is defined so that new data are merged efficiently with the original clusters and the clustering results stay up to date in real time. Experimental results show that the proposed method achieves a high confidence level (no lower than 97%) and performs well in terms of clustering time complexity, achieving accurate and fast incremental parallel clustering of massive network data.
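
The abstract does not spell out the incremental merging rule, so the following is only a minimal sketch of one common way to realise the idea, not the authors' method. It assumes that a newly arrived batch is clustered on its own with DBSCAN and that a new cluster is merged into an existing one when a core point of each lies within `eps` of the other. The function names (`dbscan`, `merge_increment`) and the parameters `eps` and `min_pts` are illustrative, the Chernoff-bounds partitioning step is omitted, and core status is not recomputed after the merge, so the result may differ slightly from rerunning DBSCAN on all of the data.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns (labels, is_core)."""
    n = len(points)
    # Each neighbourhood includes the point itself (distance 0).
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != NOISE:
            continue
        labels[i] = cluster
        stack = [i]                      # only core points are expanded
        while stack:
            k = stack.pop()
            for j in neighbors[k]:
                if labels[j] == NOISE:
                    labels[j] = cluster
                    if is_core[j]:
                        stack.append(j)
        cluster += 1
    return labels, is_core

def merge_increment(old_pts, old_labels, old_core, new_pts, eps, min_pts):
    """Assumed merge rule: cluster the new batch, then join clusters whose
    core points lie within eps of each other across the two batches."""
    new_labels, new_core = dbscan(new_pts, eps, min_pts)
    offset = max(old_labels, default=NOISE) + 1   # keep cluster ids disjoint
    new_labels = [l + offset if l != NOISE else NOISE for l in new_labels]

    parent = {l: l for l in set(old_labels + new_labels) if l != NOISE}

    def find(x):                         # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, p in enumerate(new_pts):
        if not new_core[i]:
            continue
        for j, q in enumerate(old_pts):
            if old_core[j] and dist(p, q) <= eps:
                ra, rb = find(new_labels[i]), find(old_labels[j])
                parent[max(ra, rb)] = min(ra, rb)   # older id survives

    pts = old_pts + new_pts
    core = old_core + new_core
    labels = [find(l) if l != NOISE else NOISE for l in old_labels + new_labels]
    # Noise lying within eps of a core point from the other batch becomes a border point.
    for i, p in enumerate(pts):
        if labels[i] == NOISE:
            for j, q in enumerate(pts):
                if core[j] and dist(p, q) <= eps:
                    labels[i] = labels[j]
                    break
    return pts, labels, core

# Usage: an initial square of points, then an adjacent square plus one outlier.
old = [(0, 0), (0, 1), (1, 0), (1, 1)]
old_labels, old_core = dbscan(old, eps=1.5, min_pts=3)
new = [(2, 0), (2, 1), (3, 0), (3, 1), (10, 10)]
pts, labels, core = merge_increment(old, old_labels, old_core, new, 1.5, 3)
print(labels)  # [0, 0, 0, 0, 0, 0, 0, 0, -1]
```

In this sketch only the new batch is clustered from scratch; the existing clusters are touched only through the cross-batch distance checks, which is where the saving over a full rerun comes from. In a parallel setting the same merge step could, under the same assumption, be applied between partitions that were clustered independently.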

Keywords: DBSCAN algorithm; network data; data increment; parallel clustering; Chernoff bounds criterion; incremental merging rules

Zheng Yansong (郑艳松), Tao Ligui (陶礼贵)


School of Artificial Intelligence, Zhujiang College, South China Agricultural University, Guangzhou 510980, China


2024

Modern Computer (现代计算机)
中大控股


Impact factor: 0.292
ISSN: 1007-1423
Year, volume (issue): 2024, 30(24)