VulnTrace: Tracking and Detecting Code Vulnerabilities with Historical Commits and Semantic Embeddings

World Scientific

Abstract: Open source software has evolved into a fundamental element of the contemporary information sector; however, security threats within its supply chain are persistently rising. Within the collaborative development framework of open source, the introduction of malicious code can lead to significant security vulnerabilities. Conventional methods for detecting these vulnerabilities, which rely on machine learning, face challenges such as a lack of sufficient datasets, inadequate deep semantic understanding, and limitations to single-vulnerability detection. To address these challenges, we introduce a novel approach named VulnTrace, which analyzes historical records of submissions in open source projects to construct a high-quality dataset of vulnerabilities with accurate labels. VulnTrace employs Word2Vec alongside Abstract Syntax Tree (AST) technologies to capture both the semantic and structural details of code segments and utilizes a Transformer model for precise vulnerability identification, thereby enhancing accuracy and interpretability in detection. Experimental results indicate that VulnTrace achieves approximately 93% accuracy, 95% precision, 83% recall and an F1 score of 88% in vulnerability detection tasks, significantly reducing false positives and demonstrating remarkable robustness.
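As a quick consistency check on the figures quoted in the abstract, the F1 score is the harmonic mean of precision and recall. The sketch below is not taken from the paper; the function name `f1_score` is a hypothetical helper, and the inputs are the rounded percentages from the abstract, so the result (about 0.886) only agrees with the reported 88% to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract (assumed inputs, not raw experiment data).
reported_f1 = f1_score(0.95, 0.83)
print(f"F1 from rounded precision and recall: {reported_f1:.3f}")
```

Because recall (83%) is noticeably lower than precision (95%), the harmonic mean sits closer to the recall figure than a simple average would.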

Keywords:

Software supply chain security; vulnerability detection; code feature extraction; code representation

Authors:

Qijie Song, Jiaobo Jin, Tiantian Zhu, Tieming Chen, Mingqi Lv, Licheng Pan, Jian-Ping Mei, Xiang Pan

Affiliation:

College of Computer Science and Technology, Zhejiang University of Technology, 310014, China

Publication year:

2025

DOI:

10.1142/S0218194025500196

International Journal of Software Engineering and Knowledge Engineering

ISSN: 0218-1940

Year, volume (issue): 2025, 35(5)

Number of references: 45