Top-Rank-K frequent pattern mining algorithm for large graphs
Frequent Pattern Mining(FPM)plays a crucial role in social analytics,which mines patterns and regularities about users'behaviour from vast amounts of social data,thereby provides new insights and decision support for research on social networks.However,it is not easy to set an appropriate support threshold for an FPM task.Moreover,as an NP-hard problem,FPM does not exist a polynomial-time algorithm.To address these problems,an algorithm that does not require an initial support threshold,denoted by ItrMiner,is proposed.In ItrMiner,a new metric which considers both size and support is proposed for measuring the interestingness of a pattern,and this metric is utilized to mine the Top-Rank-K frequent patterns.Meanwhile,to overcome the difficulty caused by the lack of initial support threshold,a pattern-recognition-based strategy and a pattern expansion strategy are proposed,which reduces excessive redundant candidate patterns.Extensive experiments on real and synthetic graphs show that ItrMiner performs well in terms of efficiency and scalability,especially on dense datasets,with a time overhead of only 13.2%of the baseline algorithm Top-K Graph Miner.Furthermore,the pattern expansion strategy is quite effective,which performs up to 9.5 times faster than the counterpart of ItrMinernopt without optimization.