Study on Binary Code Similarity Detection Based on Jump-SBERT
Binary code similarity detection plays an important role in many areas of security. Existing binary code similarity detection methods suffer from high computational cost, low accuracy, incomplete recognition of the semantic information of binary functions, and evaluation on a single dataset. To address these problems, a binary code similarity detection technique based on Jump-SBERT is proposed. Jump-SBERT has two main innovations. The first is to build the SBERT network structure from a Siamese network, which reduces the computational cost of the model without sacrificing accuracy. The second is to introduce a jump recognition mechanism, which enables Jump-SBERT to learn the graph structure information of binary functions, so that their semantic information is captured more comprehensively. Experimental results show that the recognition accuracy of Jump-SBERT reaches 96.3% in the small function pool (32 functions) and 85.1% in the large function pool (10,000 functions), which is 36.13% higher than state-of-the-art (SOTA) methods. Jump-SBERT is also more stable in large-scale binary code similarity detection. Ablation experiments show that both innovations have a positive effect on Jump-SBERT, with the jump recognition mechanism contributing up to 9.11%.
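
To make the function-pool evaluation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each function has already been encoded independently into a fixed-length embedding (the property of a Siamese encoder that allows embeddings to be computed once and reused), that similarity is measured by cosine similarity, and that "recognition accuracy" means top-1 retrieval accuracy within the pool. The abstract does not confirm any of these choices. The names `cosine`, `top1_accuracy`, `pool` and `queries` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def top1_accuracy(queries, pool):
    """Fraction of queries whose most similar pool entry is their true match.

    Assumption: queries[i] and pool[i] are embeddings of the same source
    function compiled in two different ways, so index i is the ground truth.
    """
    hits = 0
    for i, query in enumerate(queries):
        best = max(range(len(pool)), key=lambda j: cosine(query, pool[j]))
        if best == i:
            hits += 1
    return hits / len(queries)


# Toy embeddings standing in for encoder outputs (hypothetical values).
pool = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.7, 0.0, 0.3]]

accuracy = top1_accuracy(queries, pool)
print(f"top-1 accuracy in a pool of {len(pool)}: {accuracy:.3f}")
```

Because each embedding is computed once per function, comparing a query against a pool of N functions needs N similarity computations on stored vectors and no further passes through the encoder. A larger pool contains more candidates that can outrank the true match, which is in line with the lower accuracy reported for the 10,000-function pool.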