SIHC:An Efficient Algorithm for k-core Query on Temporal Graphs
In real applications,the relationships between entities are usually modeled using a temporal graph,where each edge is associated with a timestamp denoting the time of the interaction between two entities at that moment.The k-core is a fundamental model for capturing dense subgraphs and has been widely studied in recent years.Given a temporal graph G and a time interval I=[s,e],the snapshot graph GI of G consists of all vertices of G and edges with timestamps within I,where edges with the same endpoints are merged into a single edge without timestamp.Based on the snapshot graph GI,the k-core subgraph query on a temporal graph G returns the corresponding k-core subgraph from GI.For the k-core subgraph query problem in temporal graphs,the state-of-the-art method is based on the PHC index(Pruned Historical Core-Index).For any possible value of k,the PHC index maintains a set Sk of vertices that may appear in the k-core subgraph within any time interval,and for each vertex of Sk,PHC stores a set of time intervals to determine if it belongs to the k-core subgraph of a given time interval.When querying the k-core subgraph based on the PHC index,it is necessary to access all vertices of Sk and determine the satisfiability of each vertex.Since Sk corresponds to all vertices in the k-core subgraph of the snapshot graph corresponding to the largest interval,and the snapshot graph of the given query interval is often much smaller than the snapshot graph of the largest interval,the query algorithm based on the PHC index involves many invalid judgments,requiring detection of a large number of vertices that are not in the result set.Even worse,the number of invalid detections increases as the query interval becomes shorter,resulting in low efficiency.We show that it is the adopted interval which is not the shortest one that results in the low efficiency of PHC index.We theoretically prove the correctness of the k-core subgraph query on temporal graphs based on the shortest intervals,based on which we propose a new index called Shortest Interval History Core(SIHC)index.The basic idea of the SIHC index is maintaining an inverted table from the shortest k-core interval to vertices,based on which we can directly get vertices in the k-core subgraph that meet the requirements by accessing qualified inverted lists according to the query interval.In this way,we can avoid a large number of invalid judgments required by the PHC index.We show that the interval of PHC can be used to compute the shortest interval of ours,based on which we propose the index construction algorithm based on PHC index.To improve the efficiency of index construction,we propose an optimized index construction algorithm.We further prove that constructing SIHC index has the same time complexity with that of PHC,and they both have the same index size.Finally,we conduct rich experiments on real-world temporal graphs,and the experimental results demonstrate that the proposed algorithm is one to two orders of magnitude faster than existing algorithms.
graph data managementtemporal graphsdense subgraphsk-coreshortest k-core interval