Endogenous Security Heterogeneous Entity Generation Method Based on Large Language Model
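Chen Haoran (陈昊然) 1, Liu Yu (刘宇) 2, Chen Ping (陈平) 3
Author information
- 1. School of Software, Fudan University, Shanghai 200433
- 2. School of Computer Science and Technology, Fudan University, Shanghai 200433
- 3. Institute of Big Data, Fudan University, Shanghai 200433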
Abstract
To address the security challenges posed by unknown vulnerabilities and backdoors in software systems, this paper proposes a method for generating endogenous security heterogeneous entities based on large language models. Built around an endogenous security strategy, the method diversifies the code execution bodies that are security weak points in a program, so that when the program is attacked it can quickly switch to a healthy heterogeneous entity and keep the system running stably. Large language models are used to generate diverse heterogeneous entities, and a seed-distance-based method is used to optimize existing fuzz testing, improving the quality of generated test cases and the code coverage in order to verify that the heterogeneous entities are functionally equivalent. Experimental results show that the method can effectively repair code vulnerabilities and generate functionally equivalent heterogeneous entities; in addition, compared with the existing AFL algorithm, the optimized fuzz testing method takes less time to reach the same code coverage. The proposed method can therefore significantly improve the security and robustness of software systems and offers a new strategy for defending against unknown threats.
Key words
endogenous security/large language model/fuzz testing
Publication year
2024