Prompt Learning Based Parameter-efficient Code Generation
Automatic code generation is an effective way to improve software development efficiency. Existing research generally treats code generation as a sequence-to-sequence task, and fine-tuning large-scale pre-trained language models often incurs a high computing cost. This paper proposes a prompt learning based parameter-efficient code generation method (PPECG). The method retrieves from a code corpus the result most similar to the current intent and uses it as a prompt to guide the pre-trained language model in generating code, while keeping most of the model's parameters fixed to reduce computing cost. To verify the effectiveness of PPECG, two code generation datasets, CONCODE and Solidity4CG, are selected, and the BLEU, CodeBLEU, and Exact Match scores of the generated results are computed. Experimental results show that PPECG effectively reduces GPU memory usage during fine-tuning and comes close to, or even outperforms, the current SOTA methods on these metrics, completing code generation tasks well.
Code generation; Prompt learning; Pre-trained language model; Information retrieval; Smart contract
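
The retrieval step summarized in the abstract can be illustrated with a minimal sketch. It is not the PPECG implementation: the Jaccard token-overlap similarity, the prompt template, and the names `retrieve_most_similar`, `build_prompt`, and `EXAMPLE_CORPUS` are all assumptions made for illustration, since the abstract does not specify how similarity is measured or how the prompt is formatted.

```python
import re

# Hypothetical toy corpus of (requirement description, code) pairs.
EXAMPLE_CORPUS = [
    ("return the sum of two integers",
     "int add(int a, int b) { return a + b; }"),
    ("check whether a string is empty",
     "boolean isEmpty(String s) { return s.length() == 0; }"),
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_most_similar(intent: str, corpus: list) -> tuple:
    """Return the (description, code) pair whose description best matches the intent.

    Assumption: token overlap stands in for whatever retriever the paper uses.
    """
    query = tokenize(intent)
    return max(corpus, key=lambda pair: jaccard(query, tokenize(pair[0])))


def build_prompt(intent: str, corpus: list) -> str:
    """Prepend the retrieved example to the intent (hypothetical template)."""
    description, code = retrieve_most_similar(intent, corpus)
    return (
        f"# Similar requirement: {description}\n"
        f"{code}\n"
        f"# Requirement: {intent}\n"
    )


if __name__ == "__main__":
    print(build_prompt("return the sum of three integers", EXAMPLE_CORPUS))
```

In a full pipeline, the resulting prompt would be passed to the pre-trained language model, with most of its parameters kept fixed during fine-tuning as the abstract describes; that part is omitted here because it depends on the specific model and training framework.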