Efficient One-off Weak Gap Sequential Pattern Mining Algorithm
Sequential Pattern Mining (SPM) with gap constraints, an important branch of sequential pattern mining, can discover repetitive occurrences of patterns in sequences. However, current studies mainly focus on datasets with items, and each item in a sequence is considered to have the same effect. To address this problem, the One-off Weak gap sequential Pattern mining (OWP) algorithm is proposed, which consists of three steps: preparation, support calculation, and candidate pattern generation. In the preparation stage, an inverted index is constructed and infrequent items are pruned. For support calculation, occurrence positions are recorded using the inverted index, which avoids repeated scanning of the original dataset. For candidate pattern generation, a pattern-join strategy is used to reduce the generation of redundant candidate patterns. Experimental results on six real datasets, comprising datasets with items and datasets with itemsets, show that the OWP algorithm improves the runtime by 2.653, 1.348, and 3.592 times compared with the OWP-p, Ows-OWP, and OWP-e algorithms, respectively, and reduces memory usage by 3.51%, 0.07%, and 5%, respectively. This indicates that the OWP algorithm can mine the patterns of user interest more efficiently. In addition, on a dataset six times the size of D1, the runtime of the OWP algorithm increases by 3.763 times and its memory usage increases by 2.310 times compared with D1. Both increases are smaller than the increase in dataset size, implying that the OWP algorithm has good scalability.
Sequential Pattern Mining (SPM); mining with itemset; gap constraint; one-off condition; weak-gap constraint
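
The preparation and candidate generation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the OWP implementation: the function names `build_inverted_index` and `pattern_join`, the parameter `minsup`, and the toy sequence are hypothetical. The sketch assumes a single sequence of itemsets, treats a pattern as a tuple of single items (a simplification of patterns over itemsets), and assumes the common form of pattern join in which two frequent patterns are joined when the suffix of one equals the prefix of the other. Support calculation under the one-off condition and the weak-gap constraint is not shown.

```python
from collections import defaultdict


def build_inverted_index(sequence, minsup):
    """Map each item to the sorted positions of the itemsets that contain it.

    Items occurring at fewer than `minsup` positions are pruned
    (assumed pruning rule; `minsup` is a hypothetical parameter).
    """
    index = defaultdict(list)
    for pos, itemset in enumerate(sequence):
        for item in set(itemset):
            index[item].append(pos)  # positions are appended in increasing order
    return {item: positions for item, positions in index.items()
            if len(positions) >= minsup}


def pattern_join(frequent):
    """Generate length-(m+1) candidates from frequent length-m patterns.

    Patterns p and q are joined when p without its first element equals
    q without its last element (assumed join condition).
    """
    frequent = sorted(set(frequent))
    candidates = []
    for p in frequent:
        for q in frequent:
            if p[1:] == q[:-1]:
                candidates.append(p + q[-1:])
    return candidates


# Toy sequence of itemsets (hypothetical data).
sequence = [{"a", "b"}, {"c", "d"}, {"a"}, {"b", "c"}, {"a"}]
index = build_inverted_index(sequence, minsup=2)
candidates = pattern_join([("a", "b"), ("b", "a")])
```

In this toy run, item `d` occurs at only one position and is pruned, while later steps can look up the positions of `a`, `b`, and `c` in `index` without rescanning `sequence`. Joining only patterns that overlap on a shared sub-pattern yields two candidates here, fewer than extending each frequent pattern with every frequent item would produce.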