Software Vulnerability Detection Method Based on Abstract Syntax Tree Feature Migration (AST-FMVD)
Deep learning has made significant progress in vulnerability detection. Existing vulnerability detection algorithms, however, require large amounts of labeled data and build their detection models through supervised learning. In a multi-language environment, the diversity of languages and the lack of labeled training samples can cause a detection model to generalize poorly, especially in small-sample settings. Transfer learning offers a way out of this dilemma. Its core idea is "learning by analogy": knowledge is transferred from one domain to another, which removes the constraint of limited sample data. We propose a feature-based transfer method for vulnerability detection, named AST-FMVD. By clustering the node information of the code's syntax tree according to semantic similarity, the node mapping between different languages can be constructed quickly and accurately. At the same time, context-aware techniques are introduced into the syntax tree mapping process to help resolve ambiguous grammatical structures and improve parsing performance. The proposed method transforms detection samples from an unknown domain into a known one, so that a deep learning model built in the source domain can be applied to tasks in the new domain, achieving cross-domain knowledge transfer. Finally, we use a Java vulnerability detection model to detect files containing specific vulnerabilities, thereby applying the model in the Python domain. This demonstrates the feasibility of AST-FMVD, and the experiments show that AST-FMVD preserves the original model's good detection performance in the target domain.
deep learning; transfer learning; zero-shot; vulnerability detection; abstract syntax tree
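
To make the node-mapping idea in the abstract concrete, the following is a minimal sketch, not the AST-FMVD implementation. It assumes a small hand-written table (`PY_TO_JAVA`) in place of the mapping that the paper derives by semantic-similarity clustering, and a single illustrative context-aware rule (an `__init__` method directly inside a class is treated as a constructor). The Java-style node names, the table contents, and the helper names `map_node` and `to_java_labels` are all hypothetical.

```python
import ast

# Hypothetical Python-to-Java node-type table. In the paper this mapping is
# obtained by clustering node information by semantic similarity; it is
# hard-coded here purely for illustration.
PY_TO_JAVA = {
    "Module": "CompilationUnit",
    "ClassDef": "ClassDeclaration",
    "FunctionDef": "MethodDeclaration",
    "If": "IfStatement",
    "While": "WhileStatement",
    "For": "ForStatement",
    "Return": "ReturnStatement",
    "Call": "MethodInvocation",
    "Assign": "Assignment",
    "BinOp": "BinaryOperation",
    "Compare": "BinaryOperation",
    "Try": "TryStatement",
}


def map_node(node, parent):
    """Return a Java-style label for a Python AST node, or None if unmapped."""
    # Assumed context-aware rule: the parent node disambiguates the mapping.
    # A function named __init__ directly inside a class maps to a constructor.
    if (
        isinstance(node, ast.FunctionDef)
        and isinstance(parent, ast.ClassDef)
        and node.name == "__init__"
    ):
        return "ConstructorDeclaration"
    return PY_TO_JAVA.get(type(node).__name__)


def to_java_labels(source):
    """Walk the Python AST in pre-order and collect the mapped Java-style labels."""
    labels = []

    def visit(node, parent):
        label = map_node(node, parent)
        if label is not None:
            labels.append(label)
        for child in ast.iter_child_nodes(node):
            visit(child, node)

    visit(ast.parse(source), None)
    return labels


if __name__ == "__main__":
    sample = (
        "class Greeter:\n"
        "    def __init__(self, name):\n"
        "        self.name = name\n"
        "\n"
        "    def greet(self):\n"
        "        return run(self.name)\n"
    )
    print(to_java_labels(sample))
```

In such a setup, the resulting label sequence would be what a detector trained on Java syntax trees consumes. Unmapped node types are simply dropped in this sketch; how the actual method treats them is not stated in the abstract.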