Computer Science, 2024, Vol. 51, Issue (z1): 818-825. DOI: 10.11896/jsjkx.230600121

Test Input Prioritization Approach Based on DNN Model Output Differences

Zhu Jin (1), Tao Chuanqi (2), Guo Hongjing (1)

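Author information

  • 1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
  • 2. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016; Key Laboratory for Safety-Critical Software Development and Verification, Ministry of Industry and Information Technology, Nanjing 210016; State Key Laboratory for Novel Software Technology, Nanjing 210023; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210016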

Abstract

Deep neural network (DNN) testing requires a large amount of test data to ensure DNN quality. However, most test inputs lack labels, and labeling them manually is costly. To reduce labeling cost, researchers have proposed test input prioritization approaches that select high-priority test inputs for labeling. Most of these approaches, however, work only in limited scenarios; for example, they have difficulty identifying misclassified inputs that the model predicts with high confidence. To address this challenge, this paper applies differential testing to test input prioritization and proposes DeepDiff, a test input prioritization approach based on DNN model output differences. DeepDiff first constructs a contrast model with the same functionality as the original model, then computes the difference between the outputs of the original model and the contrast model for each test input, and finally assigns higher priority to test inputs with larger output differences. The empirical study covers four widely used datasets and eight corresponding DNN models. Experimental results show that, on average, the effectiveness of DeepDiff is 13.06% higher than that of the baseline approaches on the original test set and 39.69% higher on the mixed test set.
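The three steps in the abstract (obtain outputs from two functionally equivalent models, measure the per-input output difference, rank by that difference) can be illustrated with a minimal sketch. The abstract does not state how DeepDiff measures output difference or how the second model is built, so the L1 distance between probability vectors used here is an assumption, and the function names `output_difference` and `prioritize` are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def output_difference(p: Sequence[float], q: Sequence[float]) -> float:
    """L1 distance between two models' output vectors for one test input.

    ASSUMPTION: the abstract does not name the difference measure; L1 is
    used here only for illustration.
    """
    if len(p) != len(q):
        raise ValueError("both models must produce outputs of the same length")
    return sum(abs(a - b) for a, b in zip(p, q))


def prioritize(
    original_outputs: Sequence[Sequence[float]],
    contrast_outputs: Sequence[Sequence[float]],
) -> List[int]:
    """Return test-input indices ordered from largest to smallest difference."""
    if len(original_outputs) != len(contrast_outputs):
        raise ValueError("both models must be evaluated on the same test inputs")
    scores = [
        output_difference(p, q)
        for p, q in zip(original_outputs, contrast_outputs)
    ]
    # sorted() is stable, so inputs with equal scores keep their original order.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical softmax outputs of the two models on three test inputs.
original = [[0.90, 0.10], [0.60, 0.40], [0.50, 0.50]]
contrast = [[0.88, 0.12], [0.10, 0.90], [0.30, 0.70]]
print(prioritize(original, contrast))  # [1, 2, 0]
```

In this toy data, input 1 is ranked first because the two models disagree most on it, even though the original model's own confidence on it (0.60) is not the lowest. Under the stated assumptions, the ranking depends on disagreement between models rather than on one model's confidence alone.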

Key words

Deep neural network testing / Test input prioritization / Differential testing / Model output differences

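Publication year

2024
Computer Science
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)

Indexed in: CSTPCD, CSCD, PKU Core
Impact factor: 0.944
ISSN: 1002-137X
Number of references: 22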