Cross-project defect prediction based on joint feature distribution matching
Qiu Shaojian (邱少健) 1, Lu Lu (陆璐) 2, Zou Quanyi (邹全义) 3
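Author information
- 1. College of Mathematics and Informatics, South China Agricultural University, Guangzhou, Guangdong 510640
- 2. School of Computer Science and Engineering, South China University of Technology, Guangzhou, Guangdong 510006; Institute of Modern Industrial Technology, South China University of Technology, Zhongshan, Guangdong 528400
- 3. School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006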
Abstract
To address feature incompleteness and blurred classification boundaries in cross-project software defect prediction, a joint feature-based dual-encoder distribution matching method (DeDM-JF) is proposed. A convolutional neural network extracts defect-related structural semantic features from source code, and these are combined with manually selected handcrafted features to form joint features. On this basis, a dual autoencoder containing distribution discrepancy matching layers is constructed to learn global and local transferable features across projects, which are then used to train the defect prediction model. Experiments were conducted on 798 pairs of cross-project defect prediction tasks drawn from a software defect data repository. Compared with related cross-project defect prediction methods, DeDM-JF markedly improves the F-measure and MCC of the predictions.
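The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the linear-kernel MMD (squared distance between feature means) is only an assumed stand-in for the distribution discrepancy measure, which the abstract does not specify. The F-measure and MCC follow their standard confusion-matrix definitions.

```python
import math


def joint_feature(semantic, handcrafted):
    """Concatenate learned semantic features with handcrafted metrics.

    Hypothetical helper: the abstract only says the two kinds of
    features are combined into a joint feature.
    """
    return list(semantic) + list(handcrafted)


def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def linear_mmd(source, target):
    """Squared distance between source and target feature means.

    Assumption: a linear-kernel MMD used here as an example of a
    distribution discrepancy; the paper's actual measure may differ.
    """
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a transfer setting, a discrepancy term of this kind would typically be added to the training loss so that source-project and target-project features are pulled toward a similar distribution, while F-measure and MCC score the resulting predictions on the target project.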
Key words
software defect prediction / cross-project defect prediction / convolutional neural networks / joint feature / autoencoder / distribution matching / transfer learning
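Funding
National Natural Science Foundation of China, General Program (61370103)
Zhongshan Industry-University-Research Major Fund Project (210610173898370)
Guangdong Provincial Young Innovative Talents Fund Project for Regular Universities (2020KQNCX008)
Guangzhou Basic and Applied Basic Research Fund Project (202201010312)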
Publication year
2024