VulnTrace: Tracking and Detecting Code Vulnerabilities with Historical Commits and Semantic Embeddings

Open source software has become a fundamental part of the modern information sector, yet security threats in its supply chain keep rising. In the collaborative development model of open source, the introduction of malicious code can lead to serious security vulnerabilities. Conventional machine-learning-based methods for detecting these vulnerabilities face several challenges: insufficient datasets, inadequate deep semantic understanding, and a restriction to detecting a single type of vulnerability. To address these challenges, we introduce VulnTrace, a novel approach that analyzes the historical commit records of open source projects to construct a high-quality vulnerability dataset with accurate labels. VulnTrace combines Word2Vec with Abstract Syntax Tree (AST) techniques to capture both the semantic and structural details of code segments, and uses a Transformer model to identify vulnerabilities, thereby improving both the accuracy and the interpretability of detection. Experimental results indicate that VulnTrace achieves approximately 93% accuracy, 95% precision, 83% recall, and an F1 score of 88% in vulnerability detection tasks, significantly reducing false positives and demonstrating remarkable robustness.
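As a quick consistency check on the reported metrics, recall that the F1 score is the harmonic mean of precision and recall. The minimal sketch below recomputes it from the rounded precision and recall quoted above. The function name `f1_score` and the variable names are illustrative assumptions, not taken from the paper, and the rounded percentages are treated as exact only for the sake of the calculation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures quoted in the abstract (assumed exact here for illustration).
reported_precision = 0.95
reported_recall = 0.83

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.886"
```

With these rounded inputs the result is about 0.886, which is in line with the reported F1 score of roughly 88%; a small gap is expected because the precision and recall values are themselves approximate.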

Software supply chain security; vulnerability detection; code feature extraction; code representation

Qijie Song, Jiaobo Jin, Tiantian Zhu, Tieming Chen, Mingqi Lv, Licheng Pan, Jian-Ping Mei, Xiang Pan

College of Computer Science and Technology, Zhejiang University of Technology, 310014, China

2025

International Journal of Software Engineering and Knowledge Engineering
  • 45