Patch Correctness Verification Method Based on CodeBERT and Stacking Ensemble Learning
In recent years, automatic program repair has become an important research topic in software engineering. However, most existing automatic repair techniques follow a generate-and-test approach, so the patch verification process consumes a significant amount of time and cost. In addition, because test suites are incomplete, many candidate patches pass all tests even though they are not actually correct, which leads to the patch overfitting problem. To improve the efficiency of patch verification and alleviate patch overfitting, a static patch verification method is proposed. The method first uses the large pre-trained model CodeBERT to automatically extract semantic features from defective code fragments and patch code fragments, and then uses historical defect repair patch data to train a Stacking ensemble learning model. The trained model can then be used to verify new defect repair patches. The verification ability of the proposed method is evaluated on 1,000 patches related to the Defects4J defect dataset. Experimental results show that the static patch verification method can effectively verify patch correctness, thereby improving the efficiency of patch verification.
Automatic program repair; Patch verification; Pre-trained model; Ensemble learning; Defects4J defect dataset
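
The two-stage idea described in the abstract (feature vectors per patch, then a Stacking ensemble trained on labelled historical patches) can be illustrated with a minimal, dependency-free sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the base learners or the meta-learner, so a nearest-centroid classifier, a k-NN classifier, and a logistic-regression meta-learner are used as stand-ins, and synthetic 8-dimensional vectors from the hypothetical helper `make_data` take the place of real CodeBERT embeddings of defect and patch fragments.

```python
import math
import random

def make_data(n, dim=8, seed=0):
    """Hypothetical stand-in for CodeBERT features: label 1 = correct patch."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        X.append([rng.gauss(center, 1.0) for _ in range(dim)])
        y.append(label)
    return X, y

def centroid_fit(X, y):
    """Base learner 1 (assumed): store the mean vector of each class."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def centroid_score(cents, x):
    """Score in [0, 1]; higher when x is closer to the class-1 centroid."""
    d0, d1 = math.dist(x, cents[0]), math.dist(x, cents[1])
    return d0 / (d0 + d1 + 1e-12)

def knn_fit(X, y):
    """Base learner 2 (assumed): k-NN simply memorises the training set."""
    return list(zip(X, y))

def knn_score(model, x, k=3):
    """Fraction of the k nearest training points labelled 1."""
    nearest = sorted(model, key=lambda p: math.dist(x, p[0]))[:k]
    return sum(t for _, t in nearest) / len(nearest)

BASE = [(centroid_fit, centroid_score), (knn_fit, knn_score)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def fit_logistic(Z, y, lr=0.5, epochs=200):
    """Meta-learner (assumed): logistic regression trained with SGD."""
    w = [0.0] * (len(Z[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        for z, t in zip(Z, y):
            err = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + w[-1]) - t
            for i, zi in enumerate(z):
                w[i] -= lr * err * zi
            w[-1] -= lr * err
    return w

def fit_stacking(X, y, folds=5):
    """Train the meta-learner on out-of-fold base-learner scores."""
    n = len(X)
    meta_X = [[0.0] * len(BASE) for _ in range(n)]
    for f in range(folds):
        tr = [i for i in range(n) if i % folds != f]
        held_out = [i for i in range(n) if i % folds == f]
        X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
        for j, (fit, score) in enumerate(BASE):
            model = fit(X_tr, y_tr)
            for i in held_out:
                meta_X[i][j] = score(model, X[i])
    meta_w = fit_logistic(meta_X, y)
    base_models = [fit(X, y) for fit, _ in BASE]  # refit on all data
    return base_models, meta_w

def predict(stack, x):
    """Return 1 if the stacked model judges the patch correct, else 0."""
    base_models, w = stack
    z = [score(m, x) for (_, score), m in zip(BASE, base_models)]
    return int(sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + w[-1]) >= 0.5)

X_all, y_all = make_data(240)
train_X, train_y = X_all[:200], y_all[:200]
test_X, test_y = X_all[200:], y_all[200:]

stack = fit_stacking(train_X, train_y)
preds = [predict(stack, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The out-of-fold step is the part that makes this Stacking rather than simple averaging: the meta-learner only sees base-learner scores for patches that those base learners were not trained on, which limits leakage of training labels into the second stage. In a real pipeline the synthetic vectors would be replaced by features extracted from defect and patch code, and the base learners would be chosen to match the method actually used.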