Link Prediction in Patent Citation Networks Based on Graph and Semantic Representation Learning
[Objective] This study optimizes a link prediction model for patent citation networks to improve the analysis and prediction of technological evolution, and to further develop theories and methods of technology diffusion. [Methods] We constructed a new link prediction framework, Graph-PatentBERT-RF, based on the characteristics of patent literature. First, we used the GraphSAGE model to obtain vector representations of the patent citation network in the training set. Second, we used the PatentBERT model to obtain semantic representation vectors of patent texts along four thematic dimensions. Then, we combined these vectors with other features to train a random forest model. Finally, we obtained the optimized link prediction probabilities for the patent citation network. [Results] An empirical study in the field of quantum sensing showed that the Graph-PatentBERT-RF model achieved the best overall prediction performance, with an F1-score more than 2.2% higher than those of the baseline models. The model also revealed nonlinear relationships and complex interactions spanning more than four levels among citation relationships, multidimensional technical text, and time lag features. [Limitations] The data preprocessing steps need further optimization to improve the model's performance. [Conclusions] The proposed model improves overall link prediction performance in patent citation networks, offers an optimized solution to the problem of incomplete citation data, and supports applications in technology evolution analysis based on citation networks.
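As an illustration of the feature-combination step described in the Methods, the sketch below shows one plausible way to assemble a feature vector for a candidate citing-cited patent pair before passing it to a random forest. It is not the authors' implementation. The function names, the dictionary layout, the placeholder dimension names, the use of a Hadamard product for the graph embeddings, and the use of cosine similarity for the text embeddings are all assumptions made for this example; the abstract does not specify how the vectors are combined. The embeddings themselves are assumed to come from already trained GraphSAGE and PatentBERT models.

```python
import math
from typing import Dict, List, Sequence

# Placeholder names: the abstract mentions four thematic dimensions
# but does not name them, so these labels are hypothetical.
DIMENSIONS = ("dim_1", "dim_2", "dim_3", "dim_4")


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(citing: Dict, cited: Dict) -> List[float]:
    """Build one feature vector for a candidate citation link.

    Each patent is a dict with (assumed) keys:
      "graph": graph embedding of the patent node
      "text":  mapping from dimension name to a text embedding
      "year":  filing or publication year
    """
    # Assumed edge operator: element-wise (Hadamard) product of node embeddings.
    graph_part = [a * b for a, b in zip(citing["graph"], cited["graph"])]
    # One semantic similarity score per thematic dimension.
    text_part = [cosine(citing["text"][d], cited["text"][d]) for d in DIMENSIONS]
    # Time lag between the citing and the cited patent, in years.
    time_lag = float(citing["year"] - cited["year"])
    return graph_part + text_part + [time_lag]


# Toy patents with tiny hand-written embeddings, for illustration only.
patent_a = {"graph": [1.0, 2.0], "text": {d: [1.0, 0.0] for d in DIMENSIONS}, "year": 2020}
patent_b = {"graph": [0.5, -1.0], "text": {d: [1.0, 0.0] for d in DIMENSIONS}, "year": 2015}

features = pair_features(patent_a, patent_b)
print(features)
```

Vectors built this way for positive (observed) and negative (sampled) patent pairs could then be used to fit a random forest classifier, for example scikit-learn's `RandomForestClassifier`, whose `predict_proba` output would play the role of the link prediction probability.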