
Intelligent Penetration Path Planning and Solution Optimization Based on Reinforcement Learning

With big data technology now widely applied, the heavy reliance of traditional penetration testing on expert experience and manual operation has become an increasingly prominent problem. Automated penetration testing aims to address this so that system security vulnerabilities can be discovered more accurately and comprehensively, and finding the optimal penetration path is its most important task. However, current mainstream research has two shortcomings: 1) it searches for the optimal path in the original solution space, which contains numerous redundant paths, and this significantly increases the complexity of solving the problem; 2) it does not sufficiently evaluate vulnerability exploitation actions and actions that obtain positive rewards. Eliminating the large number of redundant penetration paths and applying exploit sample enhancement and positive reward sample enhancement simplifies the problem and optimizes the training process. On this basis, this paper proposes the MASK-SALT-DQN algorithm, which combines solution space transformation with sample enhancement, analyzes qualitatively and quantitatively how the method affects the model's solving process, and introduces the compression ratio to measure the benefit that solution space transformation brings to the model in reaching its goal. Experiments show that the proportion of redundant solution paths in the original solution space consistently remains above 83%, which demonstrates the necessity of solution space transformation. In addition, in the standard scenario the theoretical compression ratio is 57.2, and the experimental compression ratio deviates from the theoretical value by only 1.40%. Compared with the baseline methods, MASK-SALT-DQN performs best in all experimental scenarios, which confirms its effectiveness and superiority.
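The abstract names two ingredients, restricting the search to a reduced solution space and enhancing exploit and positive-reward samples, without giving implementation details. The sketch below is one plausible reading, not the authors' MASK-SALT-DQN: it assumes the reduced space can be expressed as a boolean mask over actions, and that sample enhancement can be approximated by storing extra copies of selected transitions in the replay buffer. The names `masked_greedy_action`, `EnhancedReplayBuffer`, `boost`, and `is_exploit` are hypothetical.

```python
import random
from collections import deque


def masked_greedy_action(q_values, valid_mask):
    """Pick the highest-valued action among those the mask allows.

    Assumption: redundant actions are marked False in valid_mask.
    """
    candidates = [i for i, ok in enumerate(valid_mask) if ok]
    if not candidates:
        raise ValueError("mask excludes every action")
    return max(candidates, key=lambda i: q_values[i])


class EnhancedReplayBuffer:
    """Replay buffer that over-represents exploit and positive-reward samples.

    Assumption: enhancement is modeled as simple duplication (boost copies).
    """

    def __init__(self, capacity=10_000, boost=3):
        self.buffer = deque(maxlen=capacity)
        self.boost = boost

    def add(self, state, action, reward, next_state, done, is_exploit=False):
        transition = (state, action, reward, next_state, done)
        # Store several copies of transitions we want sampled more often.
        copies = self.boost if (is_exploit or reward > 0) else 1
        for _ in range(copies):
            self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling; duplicated transitions are drawn more often.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)
```

For example, `masked_greedy_action([0.9, 0.5, 0.7], [False, True, True])` skips action 0 despite its higher value because the mask excludes it. Duplication is only the simplest way to raise sampling frequency; a weighted or prioritized sampler would serve the same purpose.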

Penetration path planning; Reinforcement learning; Solution space transformation; Sample enhancement; Compression ratio

Li Cheng'en (李成恩), Zhu Dongjun (朱东君), He Jieyan (贺杰彦), Han Lansheng (韩兰胜)


School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430000, China

Wuhan Jinyinhu Laboratory, Wuhan 430000, China


Funding: National Key R&D Program of China (2022YFB3103402); National Natural Science Foundation of China (62072200, 62172176, 62127808)

2024

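Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)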


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(11)