Cross-project Software Defect Prediction Based on Federated Transfer
Cross-project software defect prediction builds a model from labeled data drawn from multiple source projects, which addresses the problems of insufficient historical software data and high labeling cost. However, in traditional cross-project defect prediction, source project data holders keep their data isolated to protect business privacy, and the resulting "data island" problem directly degrades the performance of cross-project prediction models. Therefore, in this paper, we propose a cross-project software defect prediction method based on federated transfer (FT-CPDP). Firstly, to address data privacy leakage and feature heterogeneity between projects, this paper presents an algorithm that combines federated learning and transfer learning to break down the "data barrier" among data holders and to build a cross-project defect prediction model in a privacy-preserving scenario. Secondly, during federated communication, the level of privacy protection is increased by adding noise that satisfies the privacy budget. Finally, a convolutional neural network model is built to perform software defect prediction. Experiments on the NASA software defect prediction dataset show that, compared with traditional cross-project defect prediction methods, FT-CPDP achieves better overall performance while preserving the privacy of software data.
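The noise-addition step in the federated communication process can be illustrated with a minimal sketch. The abstract does not specify the noise mechanism, so the code below assumes one common choice: each data holder clips its model update to a fixed L2 norm and adds Gaussian noise calibrated to a privacy budget (epsilon, delta) before the updates are averaged. All function names, the clipping bound, and the budget values are hypothetical and are not taken from FT-CPDP itself; treating the clipping bound as the sensitivity is also an assumption that depends on the adjacency definition used.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    scale = clip_norm / norm
    return [x * scale for x in update]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (intended for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(update, clip_norm, epsilon, delta, rng):
    """Clip a client update, then add Gaussian noise to every coordinate."""
    clipped = clip_update(update, clip_norm)
    # Assumption: the clipping bound is used as the L2 sensitivity.
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def federated_average(updates):
    """Coordinate-wise mean of the client updates (unweighted)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# Hypothetical updates from three source-project data holders.
rng = random.Random(0)
client_updates = [[0.5, -1.2, 3.0], [0.1, 0.4, -0.3], [2.0, 2.0, 2.0]]
noisy_updates = [
    privatize(u, clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
    for u in client_updates
]
global_update = federated_average(noisy_updates)
```

In this sketch only the noisy, clipped updates leave each data holder, so the server never sees raw project data. A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of a noisier aggregated model.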