
Universal Certified Defense for Black-Box Models Based on Randomized Smoothing

In recent years, image classification models based on deep neural networks (DNNs) have been widely deployed in critical domains such as facial recognition and autonomous driving, demonstrating excellent performance. However, DNNs are vulnerable to adversarial examples, which cause models to misclassify, so improving model robustness has become a major research direction. Most current defenses, especially empirical ones, rest on a white-box assumption: the defender has detailed knowledge of the model, such as its architecture and parameters. Model owners, however, are often unwilling to share this information for privacy reasons. Moreover, existing black-box defenses cannot defend against perturbations under all norms and thus lack universality. This paper therefore proposes a universal certified defense applicable to black-box models. Specifically, we first design a query-based, data-free substitute-model generation scheme: without prior knowledge such as the model's training data or architecture, it uses queries and zeroth-order optimization to produce a high-quality substitute model, converting the certified-defense setting into a white-box one while preserving the privacy of the original model. Second, we propose randomized smoothing and noise-selection methods based on the white-box substitute model, yielding a universal certified defense that resists perturbations under any norm. By comparing the white-box certified-defense performance of the original and substitute models, we verify the effectiveness of the substitute model. Compared with existing methods, the proposed universal black-box certified defense achieves a marked improvement on the CIFAR10 dataset. Experimental results show that it matches the performance of white-box certified defenses; relative to previous black-box certified defenses, it supports certification under all Lp norms while improving certified accuracy by more than 20%. It also effectively protects the privacy of the original model, reducing the success rate of membership inference attacks by 5.48% compared with the original model.
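The query-based, zeroth-order optimization mentioned above can be illustrated with a standard two-point random-direction gradient estimator, which needs only loss values returned by queries rather than gradients. This is a minimal sketch of the general technique, not the paper's actual substitute-model training loop; `query_fn`, the smoothing parameter `mu`, and the sample count are illustrative assumptions.

```python
import numpy as np

def zeroth_order_gradient(query_fn, x, mu=1e-3, n_samples=20, rng=None):
    """Estimate the gradient of a black-box scalar loss at x.

    query_fn : callable mapping an input vector to a scalar loss; each
               call corresponds to one query to the black-box model.
    mu       : finite-difference smoothing radius.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)  # random probe direction
        # two-point estimate: (f(x + mu*u) - f(x - mu*u)) / (2*mu) scales u
        diff = query_fn(x + mu * u) - query_fn(x - mu * u)
        grad += (diff / (2.0 * mu)) * u
    return grad / n_samples
```

For example, for the loss f(v) = ||v||^2 the true gradient is 2v, and with enough probe directions the estimate converges to it; each probe costs two model queries, which is why query efficiency matters in this setting.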
Universal Certified Defense for Black-Box Models Based on Random Smoothing
In recent years, the widespread application of image classification models based on deep neural networks (DNNs) has significantly impacted critical fields, including facial recognition and autonomous driving. These models have showcased remarkable performance, revolutionizing the way we interact with technology. However, despite their success, deep neural networks are not without vulnerabilities, particularly in the face of adversarial attacks, which can lead to misclassification and compromise the integrity of these models. Addressing this challenge has become a pivotal research direction, as ensuring the robustness of these models is essential for their real-world deployment. Currently, many defense methods, especially empirical ones, operate under the white-box assumption. This assumption relies on defenders having access to detailed information about the model, including its architecture and parameters. Unfortunately, model owners often hesitate to share such sensitive information due to privacy concerns. Even existing black-box defense methods struggle to provide comprehensive protection against attacks involving all norms, lacking the necessary universality. In response to this challenge, this paper proposes a universal certified defense method applicable to a broad spectrum of black-box models. The key innovation lies in the design of a query-based, data-free substitute model generation scheme. Unlike traditional methods, this scheme eliminates the need for training data and prior knowledge of the model structure. Leveraging queries and zeroth-order optimization, it generates high-quality substitute models, effectively transforming the certified defense scenario into a white-box setting without compromising model privacy. Furthermore, this paper introduces additional layers of security through the incorporation of randomized smoothing and noise selection methods based on the white-box substitute model. These enhancements contribute to the construction of a universal certified defense solution capable of resisting adversarial attacks involving any norm. To validate the effectiveness of the substitute model, performance comparisons are made with the original model under white-box certified defense conditions. The experimental results, particularly on the CIFAR10 dataset, showcase the superiority of the proposed universal black-box certified defense solution over existing methods. The solution not only achieves significant improvements in certification accuracy but also maintains performance similar to that of white-box certified defense methods. Notably, compared to previous black-box certified defense methods, the proposed solution demonstrates over a 20% improvement in certification accuracy while effectively safeguarding the privacy of the original model. Specifically, the proposed solution reduces the success rate of membership inference attacks by 5.48%, further highlighting its robustness and practical applicability in real-world scenarios.
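The randomized-smoothing certification that the abstract builds on can be sketched as a Monte-Carlo procedure in the spirit of Cohen et al.: sample noisy copies of the input, take the majority vote of the base classifier, and convert the top-class probability into a certified L2 radius. The `base_classifier` interface, the sample count, and the simple plug-in probability bound below are illustrative assumptions; the paper's scheme additionally selects noise to certify norms beyond L2, which this sketch does not cover.

```python
from collections import Counter
from statistics import NormalDist
import random

def smoothed_predict_certify(base_classifier, x, sigma, n=1000, rng=None):
    """Monte-Carlo randomized smoothing for a single input x.

    base_classifier : callable mapping an input vector to a class label
                      (a hypothetical API for the black-box/substitute model).
    Returns (top_class, certified_l2_radius). A rigorous version would
    replace the plug-in estimate p_hat with a Clopper-Pearson lower bound.
    """
    rng = rng or random.Random(0)
    counts = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]  # Gaussian corruption
        counts[base_classifier(noisy)] += 1
    top_class, top_count = counts.most_common(1)[0]
    # Clamp so inv_cdf stays finite when every sample agrees.
    p_hat = min(top_count / n, 1.0 - 1.0 / (2 * n))
    if p_hat <= 0.5:
        return top_class, 0.0  # abstain: no radius can be certified
    # Cohen et al.'s bound (simplified): R = sigma * Phi^{-1}(p)
    radius = sigma * NormalDist().inv_cdf(p_hat)
    return top_class, radius
```

For instance, a 1-D classifier that thresholds on sign, evaluated at a point far from the decision boundary, yields a high vote share and hence a strictly positive certified radius; inputs near the boundary drive p_hat toward 0.5 and the radius toward zero.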

deep neural networks; certified defense; randomized smoothing; black-box models; substitute models

李瞧、陈晶、张子君、何琨、杜瑞颖、汪欣欣


Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430079, China

Rizhao Institute of Information Technology, Wuhan University, Rizhao, Shandong 276800, China

Collaborative Innovation Center of Geospatial Technology, Wuhan 430079, China

deep neural networks; certified defense; randomized smoothing; black-box models; substitute models

Funding: National Key R&D Program of China; National Natural Science Foundation of China (two grants); Key R&D Program of Hubei Province; Key R&D Program of Shandong Province; Wuhan Science and Technology Program

Grant numbers: 2022YFB3102100; 62206203; 62076187; 2022BAA039; 2022CXPT055; 2023010302020707

2024

Chinese Journal of Computers (计算机学报)
Sponsored by the China Computer Federation and the Institute of Computing Technology, Chinese Academy of Sciences


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 3.18
ISSN: 0254-4164
Year, Volume (Issue): 2024, 47(3)