Security Analysis of Cryptographic Application Code Generated by Large Language Model

Guo Xiangxin (郭祥鑫)¹, Lin Jingqiang (林璟锵)¹, Jia Shijie (贾世杰)², Li Guangzheng (李光正)¹

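Author information

1. School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230027
2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100085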

Abstract

The widespread adoption of large language models (LLMs) in software development improves development efficiency but also introduces new security risks, particularly in cryptographic applications, where security requirements are high. This paper proposes LLMCryptoSE, an open-source prompt dataset for evaluating the security of LLM-generated cryptographic application code; it contains 460 natural-language prompts describing cryptographic scenarios. The paper then analyzes the code snippets generated by LLMs, focusing on the misuse of cryptographic APIs, and reviews them using the static analysis tool CryptoGuard combined with manual inspection. In an evaluation of three mainstream LLMs (ChatGPT 3.5, ERNIE 3.5, and Spark 3.5), cryptographic misuse detection on the 1380 generated code snippets found that 52.90% of the snippets contained at least one cryptographic misuse; Spark 3.5 performed relatively better, with a misuse rate of 48.48%. The paper not only reveals the challenges LLMs currently face in the security of cryptographic application code, but also offers a series of recommendations to help model users and developers enhance security, with the aim of providing practical guidance for applying LLMs in the cryptographic field.
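To make the notions of "cryptographic API misuse" and "misuse rate" concrete, the following is a minimal Python sketch of a toy checker. It is not CryptoGuard and not the paper's method: CryptoGuard performs static program analysis of Java code, whereas this sketch only matches two well-known misuse patterns with regular expressions. The rule names, the `find_misuses` and `misuse_rate` helpers, and the sample snippets are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical, heavily simplified rules for Java snippets (illustration only).
# A real tool such as CryptoGuard uses program analysis, not regex matching.
MISUSE_PATTERNS = {
    # "AES" with no mode falls back to ECB in the default Java provider.
    "ecb_or_default_mode": re.compile(r'Cipher\.getInstance\(\s*"AES(/ECB[^"]*)?"\s*\)'),
    # MD5 and SHA-1 are considered broken for collision resistance.
    "weak_hash": re.compile(r'MessageDigest\.getInstance\(\s*"(MD5|SHA-?1)"\s*\)'),
}


def find_misuses(snippet: str) -> list[str]:
    """Return the names of all rules that match the given code snippet."""
    return [name for name, pattern in MISUSE_PATTERNS.items() if pattern.search(snippet)]


def misuse_rate(snippets: list[str]) -> float:
    """Percentage of snippets that contain at least one flagged misuse."""
    if not snippets:
        return 0.0
    flagged = sum(1 for s in snippets if find_misuses(s))
    return 100.0 * flagged / len(snippets)


# Made-up sample snippets, not taken from the paper's dataset.
SNIPPETS = [
    'Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");',
    'Cipher c = Cipher.getInstance("AES/GCM/NoPadding");',
    'MessageDigest md = MessageDigest.getInstance("MD5");',
    'MessageDigest md = MessageDigest.getInstance("SHA-256");',
]

if __name__ == "__main__":
    for s in SNIPPETS:
        print(find_misuses(s), "<-", s)
    print(f"misuse rate: {misuse_rate(SNIPPETS):.2f}%")
```

The rate here counts a snippet once however many rules it triggers, matching the abstract's "at least one cryptographic misuse" criterion. Pattern matching of this kind misses misuses that depend on data flow (for example, a hard-coded key passed through a variable), which is one reason the study pairs a static analyzer with manual review.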

Keywords

large language model/cryptographic application security prompts/cryptographic misuse detection

Funding

National Natural Science Foundation of China (62272457)

National Key Research and Development Program of China (2020YFB1005803)

Publication year

2024

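Netinfo Security (信息网络安全)

Third Research Institute of the Ministry of Public Security; Computer Security Professional Committee, China Computer Federation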

Indexed in: CSTPCD, CSCD, CHSSCD, PKU Core (北大核心)

Impact factor: 0.814

ISSN: 1671-1122

Number of references: 20
