Evaluating Fuzzers Based on Fuzzing-hampering Features

HAO Gaojian¹, LI Feng², HUO Wei¹, ZOU Wei¹

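Author information

1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, Beijing 100195, China; Beijing Key Laboratory of Network Security and Protection Technology, Beijing 100195, China; University of Chinese Academy of Sciences, Beijing 100049, China
2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, Beijing 100195, China; Beijing Key Laboratory of Network Security and Protection Technology, Beijing 100195, China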

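Abstract (translated from the Chinese)

Fuzz testing is an efficient technique for discovering software vulnerabilities. It has produced a rich body of research and is widely applied in both academia and industry, and many fuzzing tools have emerged. These tools differ markedly in technical characteristics and performance, so they must be tested to assess their effectiveness and thereby guide tool selection and improvement. However, existing fuzzer evaluation methods commonly yield results that, in some cases, cannot be explained. We find that this is because existing evaluations generally overlook fuzzing-hampering features. To address this, this paper studies in depth how hampering features affect fuzz testing, summarizes and distills 5 hampering features, proposes a fine-grained comparative evaluation method that treats hampering features as a controlled variable, and uses code synthesis to construct Bench4I, a benchmark of 118 target programs. An evaluation of 6 different fuzzers shows that the method can accurately explain how target program samples affect the effectiveness of the tools under test and can thereby infer each tool's specific capabilities, which effectively improves the interpretability of the evaluation. Based on the results, this paper offers usage and improvement suggestions for the tested tools and puts an improvement to QSYM into practice, with good results.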

Abstract

Fuzz testing is an efficient method for finding security-critical bugs. In recent years, many works on fuzz testing have been proposed in both industry and academia, and a variety of fuzzing tools have been developed. These tools differ in technique and performance, so fuzzers need to be evaluated in order to understand them. However, many existing evaluations suffer from poor interpretability, which limits what can be learned from their results. In this paper, we find that evaluation results can be affected by many factors, including the fuzzing-hampering features contained in the target programs. Existing evaluations pay little attention to fuzzing-hampering features, which makes it impossible to explain the reasons behind the results and can even lead to unclear or erroneous conclusions. To address this, we propose a method for evaluating fuzzers based on fuzzing-hampering features. Our method treats fuzzing-hampering features as one of the controlled variables and performs fine-grained comparative testing to relate evaluation results to fuzzing-hampering features and identify the causes of differing results, making the evaluation more interpretable. We also develop a method for constructing benchmarks in which fuzzing-hampering features can serve as a controlled variable during evaluation. To implement the idea and show its effectiveness, we summarized 5 fuzzing-hampering features, quantitatively defined how to calculate indicators of a fuzzer's capabilities, and constructed a bug benchmark named Bench4I, which includes 118 synthetic programs with different fuzzing-hampering features. In the experiment, we evaluated 6 fuzzers. The results show that the tools' detailed capabilities can be inferred from the indicators calculated from the evaluation results, so the evaluation results become more interpretable. Based on the evaluation, we also offer several suggestions for using and improving these fuzzers. We put an improvement to QSYM into practice and obtained encouraging results.

Keywords (translated from the Chinese)

fuzz testing / evaluation / benchmark / software vulnerability

Keywords

fuzz testing / evaluation / benchmark / security-critical bug

Publication year

2024

Journal of Cyber Security (信息安全学报)

CSTPCD

ISSN：
