The TextRank algorithm constructs a graph from the positional relationships among the words of a text and computes the weight of each word with a graph sorting (ranking) algorithm. The computation requires many iterations, so the running time becomes considerable when the data set is large. To solve this problem, a keyword extraction method based on Spark GraphX is proposed. Using the distributed graph-computing framework provided by Spark GraphX, the text graph data is distributed across different nodes, and keyword extraction is carried out efficiently. Experimental results show that the automatic scoring method in this paper closely approximates manual scoring. The proposed method therefore not only has a short computation time but also yields results very close to manual annotation, which indicates that it is reasonable and feasible.
Spark GraphX; keyword extraction; graph sorting; word weight
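To make the iterative weighting step concrete, the following is a minimal single-machine sketch of TextRank-style scoring in plain Python. It is not the Spark GraphX implementation proposed in the paper. The function names, the co-occurrence window size, the damping factor of 0.85, and the stopping tolerance are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window=2):
    """Link each word to the distinct words within `window` positions after it."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            other = words[j]
            if other != word:
                # Undirected edge: store it in both adjacency sets.
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50, tol=1e-6):
    """Iteratively score nodes; each node shares its score equally among neighbours."""
    if not graph:
        return {}
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        delta = max(abs(new_scores[n] - scores[n]) for n in graph)
        scores = new_scores
        if delta < tol:  # stop early once the scores have stabilised
            break
    return scores


def top_keywords(words, k=3, window=2):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    scores = textrank(build_cooccurrence_graph(words, window))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Each pass of the loop in `textrank` visits every node and its neighbours, which is the repeated work that grows with the size of the graph and that a distributed graph framework would spread across nodes.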