To improve the performance of coverage-guided fuzzing, a method for self-adaptive optimization of fuzzing using distribution divergence and a deep reinforcement learning model was proposed. An interprocedural comparison flow graph was first constructed from the interprocedural control flow graph to characterize the spatial random field corresponding to the branch-condition variables of the program under test, and the distribution features of the random field induced by a fuzzing mutation strategy were extracted using the Monte Carlo method. Then, a deep graph convolutional neural network was constructed to extract feature embeddings of the interprocedural comparison flow graph, and this network served as the deep Q-network for deep reinforcement learning. Finally, an online deep reinforcement learning model was established on the basis of the dual deep Q-network model, and an intelligent agent was trained to optimize the fuzzing mutation strategy. In this model, the state was defined by the random-field distribution features corresponding to a seed file and its associated blocks, the action was defined as the selection of the seed file block on which mutation is focused, and the reward was defined as the distribution divergence between the approximate distributions of the random field before and after the action. A prototype of this fuzzing optimization method was implemented and evaluated in multiple rounds of up to 24 hours each. The experimental results show that, on the FuzzBench benchmark, the prototype achieves significantly faster code coverage growth and higher overall coverage than the baseline fuzzers AFL++ and HavocMAB, and outperforms CmpLog on most benchmark programs. On the Magma benchmark, the prototype demonstrates stronger vulnerability-triggering capability on the openssl, libxml, and sqlite3 targets.
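To make the state/action/reward definitions above concrete, the following is a minimal illustrative sketch, not the paper's implementation: it assumes a histogram-based Monte Carlo approximation of the random field, KL divergence as the distribution divergence, and simple linear Q-functions standing in for the graph convolutional deep Q-network; all function names (approx_distribution, kl_divergence, reward, double_dqn_target) are hypothetical.

```python
# Sketch of the divergence-based reward and the dual/double-DQN target used to
# train the mutation-scheduling agent. Assumptions: histogram approximation of
# the random field, KL divergence as the reward, and linear Q-functions in
# place of the GCN-based deep Q-network described in the abstract.
import numpy as np

def approx_distribution(samples: np.ndarray, bins: int = 32) -> np.ndarray:
    """Monte Carlo approximation of the branch-condition random field:
    a normalized histogram of sampled comparison-operand values."""
    hist, _ = np.histogram(samples, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(float) + 1e-6          # smoothing keeps the support positive
    return hist / hist.sum()

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """D_KL(p || q); one possible choice of distribution divergence."""
    return float(np.sum(p * np.log(p / q)))

def reward(samples_before: np.ndarray, samples_after: np.ndarray) -> float:
    """Reward: divergence between the approximate distributions of the
    random field before and after mutating the selected block."""
    p = approx_distribution(samples_before)
    q = approx_distribution(samples_after)
    return kl_divergence(q, p)

def double_dqn_target(r: float, next_state: np.ndarray,
                      q_online, q_target, gamma: float = 0.99) -> float:
    """Dual/double-DQN target: the online network selects the next action,
    the target network evaluates it."""
    a_star = int(np.argmax(q_online(next_state)))
    return r + gamma * float(q_target(next_state)[a_star])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    before = rng.uniform(size=1000)              # sampled operands before mutation
    after = rng.beta(2.0, 5.0, size=1000)        # samples after mutation (hypothetical)
    n_actions = 8                                 # candidate seed blocks to focus on
    W_online = rng.normal(size=(n_actions, 32))   # stand-ins for the GCN-based DQN
    W_target = rng.normal(size=(n_actions, 32))
    s_next = approx_distribution(after)           # state: distribution features
    y = double_dqn_target(reward(before, after), s_next,
                          lambda s: W_online @ s, lambda s: W_target @ s)
    print(f"reward={reward(before, after):.4f}  target={y:.4f}")
```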