
Cross Language Code Plagiarism Detection Based on Program Flow Chart and Graph Attention Network

Cross-language code plagiarism detection is widely used in software intellectual property protection and in teaching computer programming courses. However, syntactic differences between programming languages reduce the measured similarity between code fragments, which lowers the accuracy of plagiarism detection. This paper therefore proposes a cross-language code plagiarism detection method based on program flowcharts and graph attention networks. First, the code is converted into a program flowchart, and a graph attention network extracts the flowchart's features as the representation of the code. Second, a cross-matching method compares the code representations line by line to obtain similarity feature vectors. Finally, the similarity feature vectors of the code under test are concatenated, and a fully connected neural network computes the probability of plagiarism. Experimental results show that, compared with existing cross-language code plagiarism detection methods, the proposed method improves precision, recall, and F1 score. In particular, compared with CLCDSA, which is based on attribute counting, and ASTLearner, which is based on abstract syntax trees, the F1 score increases by 11% and 16%, respectively.
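The three stages named in the abstract (graph attention over a flowchart, cross-matching of representations, and a fully connected output) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-hot node-type features, the toy graphs, the weights `W` and `a`, the use of best cosine similarity as the cross-matching step, and the single sigmoid unit standing in for the fully connected network. The attention layer follows the standard single-head graph attention formulation, with the output nonlinearity omitted for brevity.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(i, neighbors, Wh, a):
    """Softmax over LeakyReLU(a . [Wh_i || Wh_j]) for each j in neighbors."""
    scores = [leaky_relu(sum(c * v for c, v in zip(a, Wh[i] + Wh[j])))
              for j in neighbors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gat_layer(features, adj, W, a):
    """One single-head graph attention layer; each node also attends to itself."""
    Wh = [matvec(W, h) for h in features]
    out = []
    for i in range(len(features)):
        neighbors = [i] + adj.get(i, [])
        alpha = attention_weights(i, neighbors, Wh, a)
        out.append([sum(al * Wh[j][k] for al, j in zip(alpha, neighbors))
                    for k in range(len(W))])
    return out

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def cross_match(A, B):
    """Assumed cross-matching: best cosine similarity in B for each row of A."""
    return [max(cosine(u, v) for v in B) for u in A]

def plagiarism_probability(sim_ab, sim_ba, weights, bias):
    """Concatenate pooled similarity features and apply one sigmoid unit."""
    feats = [sum(sim_ab) / len(sim_ab), min(sim_ab),
             sum(sim_ba) / len(sim_ba), min(sim_ba)]
    z = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical flowcharts; node features are one-hot [terminal, process, decision].
features_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
adj_a = {0: [1], 1: [2], 2: [1, 3]}      # start -> process -> decision -> loop/end
features_b = [row[:] for row in features_a]  # same flowchart, e.g. another language
adj_b = {k: v[:] for k, v in adj_a.items()}

W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]  # untrained, illustrative weights
a = [0.1, -0.3, 0.2, 0.4]

emb_a = gat_layer(features_a, adj_a, W, a)
emb_b = gat_layer(features_b, adj_b, W, a)
sim_ab = cross_match(emb_a, emb_b)
sim_ba = cross_match(emb_b, emb_a)
prob = plagiarism_probability(sim_ab, sim_ba, weights=[2.0] * 4, bias=-6.0)
print(prob)
```

Because both toy flowcharts are identical, every node finds a perfect match in the other graph. In a trained system the weights would be learned from labeled code pairs rather than fixed by hand as they are here.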

code plagiarism detection; cross programming language; program flow chart; graph attention network

Zhang Feng, Wei Youliang, Qin Yucheng


College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590


2025

Journal of Chinese Computer Systems
Shenyang Institute of Computing Technology, Chinese Academy of Sciences


Peking University Core Journal
Impact factor: 0.564
ISSN: 1000-1220
Year, volume (issue): 2025, 46(1)