Cross-Project Software Defect Prediction Based on Federated Transfer

Cross-project software defect prediction builds models from labeled data drawn from multiple source projects, which addresses the problems of insufficient historical software data and high labeling cost. In traditional cross-project defect prediction, however, source-project data holders protect the commercial privacy of their software data, and the resulting "data island" problem directly degrades the performance of cross-project models. This paper proposes a cross-project software defect prediction method based on federated transfer (FT-CPDP). First, to address data privacy leakage and feature heterogeneity between projects, it presents a model algorithm that combines federated learning with transfer learning, breaking down the "data barriers" among data holders and enabling cross-project defect prediction in a privacy-preserving setting. Second, noise that satisfies the privacy budget is added during federated communication to raise the level of privacy protection. Finally, a convolutional neural network model is built to perform software defect prediction. Experiments on the NASA software defect prediction datasets show that, compared with traditional cross-project defect prediction methods, FT-CPDP achieves better overall performance while preserving the privacy of software data.
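The abstract does not say how the privacy-budget noise is generated or how updates are aggregated, so the following is only a minimal sketch of one federated round under stated assumptions, not the FT-CPDP algorithm itself. It assumes the classic Gaussian mechanism with a per-round budget (`epsilon`, `delta`), L2 clipping of each client's update, and plain averaging on the server. A logistic-regression model stands in for the paper's convolutional neural network to keep the example short, and the transfer-learning step for heterogeneous features is omitted. All function and parameter names are hypothetical.

```python
import math
import numpy as np


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip_update(update, clip_norm):
    """Scale an update down so that its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / (norm + 1e-12))


def local_update(weights, X, y, lr=0.1, epochs=5):
    """A few gradient-descent steps of logistic regression on one client's data."""
    w = weights.copy()
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (probs - y) / len(y)
    return w - weights  # only the weight delta leaves the client


def federated_round(weights, clients, epsilon, delta, clip_norm, rng):
    """One round: each client clips and perturbs its delta; the server averages."""
    # Any two clipped updates differ by at most 2 * clip_norm in L2 norm.
    sigma = gaussian_sigma(epsilon, delta, sensitivity=2.0 * clip_norm)
    noisy_updates = []
    for X, y in clients:
        delta_w = clip_update(local_update(weights, X, y), clip_norm)
        noisy_updates.append(delta_w + rng.normal(0.0, sigma, size=delta_w.shape))
    return weights + np.mean(noisy_updates, axis=0)
```

In this sketch raw project data never leaves a client; only a clipped, noised weight delta is communicated. The budget applies to a single round, so running many rounds would require a composition analysis that the sketch does not attempt.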

Keywords: software defect prediction; federated learning; transfer learning; differential privacy; convolutional neural network

Song Huiling (宋慧玲), Li Yong (李勇), Zhang Wenjing (张文静)

School of Computer Science and Technology, Xinjiang Normal University, Urumqi, Xinjiang 830054

Software Division, Xinjiang Electronics Research Institute, Urumqi, Xinjiang 830010

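Key Laboratory of Software Development and Verification Technology for High-Safety Systems (Ministry of Industry and Information Technology), Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 211106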

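Funding: Tianshan Youth Program of the Xinjiang Uygur Autonomous Region (2020Q019); Doctoral Research Start-up Fund of Xinjiang Normal University (XJNUBS1905)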

2024

Journal of Nanjing Normal University (Natural Science Edition)
Nanjing Normal University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.427
ISSN: 1001-4616
Year, volume (issue): 2024, 47(3)