Large-scale Network Community Detection Algorithm Based on MapReduce
Community detection is a fundamental problem in social network mining. With the rapid generation of massive data, traditional community detection algorithms increasingly struggle to handle large-scale social networks, so designing efficient community detection algorithms for such networks is of great significance. This paper proposes a new distributed algorithm based on MapReduce and k-center clustering. First, it introduces the "friend circle coefficient" technique, which measures the distance between nodes more accurately. Second, it introduces the "two-stage k-center clustering" technique, which incorporates node-centrality heuristic information into the selection of center points and significantly improves the modularity of the results. Finally, it introduces a community fusion method that takes modularity as its optimization goal and automatically determines the number of communities in the network without prior knowledge. The evaluation results show that the proposed algorithm significantly outperforms state-of-the-art community detection algorithms in terms of modularity. For example, compared with LPA, the proposed algorithm increases modularity by 9.19 times on average.
Community detection; k-center clustering; Distributed computing; Data mining; Big data
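
Modularity is the quantity the abstract uses both as the optimization goal and as the evaluation metric, but the abstract does not give its formula. The sketch below assumes the standard Newman-Girvan definition for an undirected, unweighted graph, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c is the number of edges inside c, and d_c is the total degree of the nodes in c. The function name, argument names, and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition (illustrative sketch).

    edges: list of (u, v) pairs of an undirected, unweighted graph,
           assumed to contain no self-loops and no duplicate edges.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    intra_edges = defaultdict(int)  # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)   # d_c: total degree of the nodes in c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra_edges[cu] += 1

    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

print(round(modularity(edges, two_groups), 3))  # 0.357
print(round(modularity(edges, one_group), 3))   # 0.0
```

Under this definition, splitting the toy graph into its two triangles scores higher than keeping all nodes in one community. A fusion step driven by modularity would use the same comparison to decide whether merging two communities is worthwhile.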