Wuhan University Journal of Natural Sciences, 2023, Vol. 28, Issue 3: 237-245. DOI: 10.1051/wujns/2023283237

Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract

JIN Huan, LI Qinying

Author information

  • 1. Information Engineering College, Jiangxi University of Technology, Nanchang 330000, Jiangxi, China

Abstract

Smart contracts, which execute automatically on decentralized platforms like Ethereum, require high security and low gas consumption. As a result, developers have a strong demand for semantic code search tools that use natural language queries to find existing code snippets efficiently. However, existing code search models face a semantic gap between code and queries, and closing that gap requires a large amount of training data. In this paper, we propose a fine-tuning approach to bridge the semantic gap in code search and improve search accuracy. We collect 80,723 distinct <comment, code snippet> pairs from Etherscan.io and use these pairs to fine-tune, validate, and test the pre-trained CodeBERT model. Using the fine-tuned model, we develop a code search engine specifically for smart contracts. We evaluate the Recall@k and Mean Reciprocal Rank (MRR) of the fine-tuned CodeBERT model using different proportions of the fine-tuning data. Encouragingly, even a small amount of fine-tuning data can produce satisfactory results. In addition, we perform a comparative analysis between the fine-tuned CodeBERT model and two state-of-the-art models. The experimental results show that the fine-tuned CodeBERT model has superior performance in terms of Recall@k and MRR. These findings highlight the effectiveness of our fine-tuning approach and its potential to significantly improve code search accuracy.
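As an illustration of the two metrics named in the abstract, the following is a minimal sketch of how Recall@k and MRR are commonly computed for code search. It assumes each query has exactly one correct snippet and that the 1-based rank of that snippet in the result list is already known. The abstract does not give the paper's exact definitions, so the function names `recall_at_k` and `mean_reciprocal_rank` and the sample ranks are hypothetical.

```python
from typing import Sequence


def recall_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct snippet appears in the top k results.

    `ranks` holds the 1-based rank of the correct snippet for each query
    (assumption: one correct snippet per query).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank of the correct snippet over all queries."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks for five queries (illustrative values, not from the paper).
example_ranks = [1, 3, 2, 10, 1]

if __name__ == "__main__":
    for k in (1, 5, 10):
        print(f"Recall@{k}: {recall_at_k(example_ranks, k):.3f}")
    print(f"MRR: {mean_reciprocal_rank(example_ranks):.3f}")
```

Under this convention, a higher Recall@k means the correct snippet is more often among the first k results, while MRR rewards placing it near the very top of the list.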

Key words

code search; smart contract; pre-trained code models; program analysis; machine learning

Funding

Jiangxi Higher Education and Teaching Reform Project (JXJG-20-24-2)

Science and Technology Project of Jiangxi Education Department (GJJ212023)

Jiangxi University of Technology Education and Teaching Reform Project (JY2104)

Publication year

2023
Wuhan University Journal of Natural Sciences
Wuhan University
Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.066
ISSN: 1007-1202
Number of references: 28