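[Objective] To examine how changes in co-word network structure affect the predictive performance of link prediction similarity metrics. [Methods] We randomly retrieved literature data for five disciplines, published 2015-2020, from the Web of Science Core Collection; constructed co-word networks with different topological features according to different keyword frequencies; and selected 15 traditional link prediction similarity metrics, including AA, CN, RWR, and Katz, to run link prediction experiments on each co-word network, comparing how the metrics perform as network structure changes. [Results] Across disciplines, the higher the keyword frequency of a co-word network, the smaller its average clustering coefficient and the larger its density, network transitivity, average degree, average degree centrality, average betweenness centrality, and average closeness centrality, and the more likely link prediction is to perform poorly; conversely, the larger the average clustering coefficient and the smaller the other topological attributes, the more likely link prediction is to perform well. Among the 15 selected metrics, RWR performed best on co-word networks of every topological profile, and Katz gave the most stable predictions. By discipline, the metrics' predictions were most affected by network structure changes in LAW. [Limitations] Because of limited computing resources, only a single classification method and a single evaluation metric were used; the study is also confined to node-similarity-based metrics and does not examine other categories, such as likelihood-analysis-based and probabilistic-model-based metrics. [Conclusions] Starting from the keyword frequencies of co-word networks, the study examines how network structure changes affect link prediction performance, providing a theoretical basis for selecting similarity metrics for co-word networks of different disciplines and sizes.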
Influence of Network Structure Changes on Co-word Network Link Prediction
[Objective] This article studies how changes in co-word network structure affect the performance of similarity-based link prediction metrics. [Methods] First, we randomly retrieved literature from the ISLS, LAW, BSS, COM, and Ocean disciplines, published from 2015 to 2020, from the Web of Science Core Collection. Second, according to different keyword frequencies, we constructed co-word networks with different topological features, such as the number of nodes and edges, average clustering coefficient, density, network transitivity, and average degree. Finally, we chose 15 traditional link prediction similarity metrics (e.g., AA, CN, RWR, and Katz), ran link prediction experiments on each co-word network, and compared how the metrics performed as network structure changed. [Results] (1) Across disciplines, in most cases, the higher the keyword frequency of a co-word network, the smaller its average clustering coefficient and the larger its density, network transitivity, average degree, average degree centrality, average betweenness centrality, and average closeness centrality, and the more likely link prediction is to perform poorly. Conversely, the larger the average clustering coefficient and the smaller the other topological attributes, the more likely link prediction is to perform well. (2) Among the 15 selected similarity metrics, RWR performed best on co-word networks with different topological characteristics, and Katz gave the most stable predictions across networks. By discipline, the predictions of all metrics were most affected by network structure changes in LAW. [Limitations] Because of limited computing resources, we used only one classification method and one evaluation metric. In addition, we considered only node-similarity-based metrics and did not examine other categories, such as likelihood-analysis-based and probabilistic-model-based metrics. [Conclusions] By examining how keyword frequency, and the network structure changes it causes, affect link prediction, this study provides a theoretical basis for selecting similarity metrics for co-word networks of different disciplines and sizes.
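
As a rough illustration of the workflow the abstract describes, the sketch below builds a co-word network from keyword lists at a chosen keyword-frequency threshold, computes two topological features (density and average degree), and scores a node pair with two of the named metrics, CN (common neighbors) and AA (Adamic-Adar). It is only a sketch under stated assumptions, not the authors' pipeline: the function names, the `min_freq` parameter, and the toy `docs` data are all hypothetical, and RWR, Katz, and the evaluation step are left out.

```python
from collections import Counter
from itertools import combinations
from math import log


def build_coword_network(docs, min_freq=1):
    """Return {keyword: set(neighbours)} for keywords found in >= min_freq documents.

    Two kept keywords are linked when they co-occur in at least one document.
    `min_freq` is a hypothetical stand-in for the keyword-frequency setting.
    """
    freq = Counter(kw for doc in docs for kw in set(doc))
    kept = {kw for kw, n in freq.items() if n >= min_freq}
    adj = {kw: set() for kw in kept}
    for doc in docs:
        for a, b in combinations(sorted(set(doc) & kept), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def density(adj):
    """Share of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def average_degree(adj):
    """Mean number of neighbours per keyword."""
    return sum(len(nb) for nb in adj.values()) / len(adj) if adj else 0.0


def common_neighbors(adj, x, y):
    """CN score: number of neighbours shared by x and y."""
    return len(adj[x] & adj[y])


def adamic_adar(adj, x, y):
    """AA score: shared neighbours weighted by 1 / log(degree).

    A shared neighbour is linked to both x and y, so its degree is at
    least 2 and log(degree) is strictly positive.
    """
    return sum(1 / log(len(adj[z])) for z in adj[x] & adj[y])


# Invented toy data: each inner list is the keyword set of one document.
docs = [
    ["link prediction", "co-word network", "similarity"],
    ["link prediction", "similarity", "random walk"],
    ["co-word network", "bibliometrics"],
    ["random walk", "similarity"],
    ["bibliometrics", "link prediction"],
]

adj = build_coword_network(docs, min_freq=1)
print(len(adj), density(adj), average_degree(adj))
print(common_neighbors(adj, "co-word network", "random walk"))
print(adamic_adar(adj, "co-word network", "random walk"))
```

Rebuilding the network with a larger `min_freq` drops low-frequency keywords and changes the topological features, which is the kind of structural variation the experiments compare across; the values on this toy data carry no evidential weight.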