
Revisiting Test Sample Selection for CNN Models: Considering Model Calibration

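Deep neural networks (DNNs) are widely used across many tasks, so testing a DNN thoroughly before deployment is essential, which in turn requires a test set that can exercise the model adequately. Because the labeling budget is limited, a test subset is usually obtained through test sample selection. However, when test samples are selected with methods based on predictive uncertainty (which have shown excellent ability to find misclassified samples and to improve retraining performance), the question of whether the predictive uncertainty estimated for those samples is itself accurate has been overlooked. To fill this gap, we show through experiments, both qualitatively and quantitatively, the correlation between the degree of model calibration and the uncertainty metrics used in test sample selection. Since calibrating a model gives it more accurate predictive uncertainty estimates, we further study whether models with different degrees of calibration yield test subsets of different quality when selecting with uncertainty metrics. Extensive experiments and analysis on 3 public datasets and 4 convolutional neural network (CNN) architectures show that, for CNN models: 1) uncertainty metrics and model calibration are correlated to some degree; 2) test subsets selected by well-calibrated models are of higher quality than those selected by poorly calibrated models. In terms of the ability to find misclassified samples, models trained with calibration outperform uncalibrated models in 70.57% of the experimental results. Considering model calibration is therefore important in test sample selection, and model calibration can be used to improve selection performance.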
Revisiting Test Sample Selection for CNN Under Model Calibration
Deep neural networks are widely used in various tasks, and model testing is crucial to ensure their quality. Test sample selection can solve the issue of labor-intensive manual labeling by strategically choosing a small set of data to label. However, existing selection metrics based on predictive uncertainty neglect the accuracy of the estimation of predictive uncertainty. To fill the gaps of the above studies, we conduct a systematic empirical study on 3 widely used datasets and 4 convolutional neural networks (CNN) to reveal the relationship between model calibration and predictive uncertainty metrics used in test sample selection. We then compare the quality of the test subset selected by calibrated and uncalibrated models. The findings indicate a degree of correlation between uncertainty metrics and model calibration in CNN models. Moreover, CNN models with better calibration select higher-quality test subsets than models with poor calibration. Specifically, the calibrated model outperforms the uncalibrated model in detecting misclassified samples in 70.57% of the experiments. Our study emphasizes the importance of considering model calibration in test selection and highlights the potential benefits of using a calibrated model to improve the adequacy of the testing process.
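To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name its uncertainty metrics or its calibration measure, so predictive entropy and expected calibration error (ECE) are used here as assumed, commonly used stand-ins. The function names and the bin count are hypothetical.

```python
import numpy as np


def entropy_scores(probs):
    """Predictive entropy per sample; higher means more uncertain.

    probs: array of shape (n_samples, n_classes), rows summing to 1.
    """
    # Small epsilon avoids log(0) for one-hot-like predictions.
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)


def select_most_uncertain(probs, budget):
    """Return indices of the `budget` most uncertain samples (assumed metric: entropy)."""
    scores = entropy_scores(probs)
    # Stable sort on negated scores: most uncertain first, ties keep input order.
    return np.argsort(-scores, kind="stable")[:budget]


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE: weighted average gap between confidence and accuracy over confidence bins."""
    conf = probs.max(axis=1)            # confidence of the predicted class
    pred = probs.argmax(axis=1)         # predicted class
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap    # weight by fraction of samples in the bin
    return ece


if __name__ == "__main__":
    # Toy softmax outputs for three unlabeled test inputs (illustrative values only).
    probs = np.array([[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]])
    print("selected for labeling:", select_most_uncertain(probs, budget=2))
```

A lower ECE means the model's confidence tracks its accuracy more closely, which is the sense in which a better-calibrated model would be expected to rank test inputs by uncertainty more reliably.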

Convolutional neural network testing; Predictive uncertainty; Model calibration; Test sample selection

Zhao Tong (赵通), Sha Chaofeng (沙朝锋)


School of Computer Science, Fudan University, Shanghai 200433, China


2024

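Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)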


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(6)