VulnTrace: Tracking and Detecting Code Vulnerabilities with Historical Commits and Semantic Embeddings
World Scientific
Open source software has become a fundamental part of the modern information industry, yet security threats in its supply chain continue to rise. In the collaborative open source development model, the introduction of malicious code can create serious security vulnerabilities. Conventional machine-learning-based methods for detecting these vulnerabilities suffer from insufficient datasets, shallow semantic understanding, and a limitation to single-vulnerability detection. To address these challenges, we introduce VulnTrace, a novel approach that analyzes the historical commit records of open source projects to construct a high-quality, accurately labeled vulnerability dataset. VulnTrace combines Word2Vec with Abstract Syntax Tree (AST) analysis to capture both the semantic and structural details of code segments, and uses a Transformer model for vulnerability identification, thereby improving detection accuracy and interpretability. Experimental results indicate that VulnTrace achieves approximately 93% accuracy, 95% precision, 83% recall, and an F1 score of 88% in vulnerability detection tasks, significantly reducing false positives and demonstrating strong robustness.
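As a quick sanity check on the reported metrics, the F1 score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. The sketch below is illustrative only: the function name `f1_score` and the variable names are assumptions, not part of VulnTrace, and the inputs are the rounded figures quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Rounded values taken from the abstract (approximate, not raw experiment data).
reported_precision = 0.95
reported_recall = 0.83

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.886"
```

The recomputed value of about 0.886 agrees with the reported F1 of roughly 88% once rounding of the published precision and recall is taken into account.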