

A Prompt-Focused Privacy Evaluation and Obfuscation Method for Large Language Models
Although large language models (LLMs) perform impressively in semantic understanding, frequent user interactions introduce many privacy risks. This paper evaluates the privacy of existing LLMs through partial recall attacks and simulated inference games. The findings indicate that common LLMs still face two challenging privacy risks: data anonymization can degrade the quality of model responses, and potential privacy information can still be inferred through reasoning. To address these challenges, this paper proposes a prompt-focused privacy evaluation and obfuscation method for large language models. The method unfolds as a structured process comprising initial description decomposition, fabricated description generation, and description obfuscation. Experimental results show that the proposed method effectively enhances privacy protection: compared with existing methods, the normalized Levenshtein distance, Jaccard similarity, and cosine similarity between pre-processing and post-processing model responses all decrease to some extent. The approach also significantly limits the inference capability of LLMs, with inference accuracy dropping from 97.14% on unprocessed prompts to 34.29%. This study not only deepens the understanding of privacy risks in LLM interactions but also introduces a comprehensive approach to enhancing user privacy security, effectively addressing the two challenging privacy risk scenarios above.
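The abstract compares model responses before and after processing using three standard text-similarity metrics. A minimal sketch of how such metrics are commonly computed is given below; whitespace tokenization is an assumption here, as the paper's exact tokenization scheme is not stated in the abstract:

```python
import math
from collections import Counter

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def normalized_levenshtein(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    if not a and not b:
        return 0.0
    return levenshtein(a, b) / max(len(a), len(b))

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over whitespace-token sets."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0

def cosine(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words term-frequency vectors."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Lower Jaccard and cosine similarity between the original and obfuscated responses indicate that less of the original content survives, which is the direction of change the abstract reports.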

privacy risk; LLM; prompt engineering; description obfuscation

焦诗琴、张贵杨、李国旗


School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China


Key Laboratory of Reliability and Environmental Engineering Technology Fund

10100002019114012

2024

Netinfo Security (信息网络安全)
The Third Research Institute of the Ministry of Public Security; Computer Security Professional Committee of China Computer Federation


Indexed in: CSTPCD, CHSSCD, Peking University Core Journals (北大核心)
Impact factor: 0.814
ISSN:1671-1122
Year, Volume (Issue): 2024, 24(9)