Defense Scheme for Removing Deep Neural Network Backdoors Based on JSMA Adversarial Attacks

Deep learning models lack transparency and interpretability. When a backdoor planted by a malicious attacker is triggered at inference time, the model behaves abnormally and its performance degrades. To address this problem, this paper proposes a defense scheme that removes backdoors from deep neural networks (DNNs) based on the JSMA adversarial attack. First, the special perturbations produced by simulating JSMA are used to recover the latent backdoor trigger, and the backdoor trigger pattern is reconstructed on that basis. Second, a heatmap is used to locate the weights associated with the recovered hidden trigger. Finally, a ridge regression function is used to set those weights to zero, removing the backdoor from the DNN. Model performance is tested on the MNIST and CIFAR10 datasets, and the performance of the model after backdoor removal is evaluated. The experimental results show that the proposed scheme effectively removes backdoors from DNN models while reducing the test accuracy of the DNN by less than 3%.
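The first step above relies on JSMA (the Jacobian-based Saliency Map Attack), which repeatedly raises the input feature that most increases the target-class output while decreasing the others. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a toy linear softmax classifier with an analytic Jacobian, and it perturbs one feature at a time rather than the pixel pairs used in the original attack. The names `jacobian`, `jsma_saliency`, `jsma_perturb`, and the step size `theta` are hypothetical. Under these assumptions, the difference between the perturbed input and the original can be read as a rough estimate of which features a trigger for the target label would occupy.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def jacobian(W, b, x):
    """Forward derivative dF/dx of F(x) = softmax(Wx + b), shape (classes, features)."""
    p = softmax(W @ x + b)
    return p[:, None] * (W - p @ W)

def jsma_saliency(J, target):
    """Saliency of each feature for pushing the prediction toward `target`."""
    d_target = J[target]
    d_others = J.sum(axis=0) - d_target
    # Keep only features that raise the target class and lower the rest.
    mask = (d_target > 0) & (d_others < 0)
    return np.where(mask, d_target * np.abs(d_others), 0.0)

def jsma_perturb(W, b, x, target, theta=0.2, max_steps=50):
    """Greedily increase the most salient feature until `target` is predicted."""
    x = x.copy()
    for _ in range(max_steps):
        if softmax(W @ x + b).argmax() == target:
            break
        s = jsma_saliency(jacobian(W, b, x), target)
        s[x >= 1.0] = 0.0        # saturated features cannot grow further
        if s.max() <= 0:
            break                # no useful feature left to perturb
        i = s.argmax()
        x[i] = min(1.0, x[i] + theta)
    return x

# Toy model (assumed for illustration): 3 classes, 4 features in [0, 1].
W = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 2.0]])
b = np.zeros(3)
x = np.array([0.9, 0.1, 0.0, 0.0])

x_adv = jsma_perturb(W, b, x, target=2)
delta = x_adv - x                # candidate trigger estimate under these assumptions
print(np.flatnonzero(delta))
```

In a real DNN the Jacobian would come from automatic differentiation rather than a closed form, and the recovered perturbation would still need the heatmap and weight-zeroing steps described in the abstract.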

Keywords: deep learning model; adversarial attack; JSMA; ridge regression function

Zhang Guanghua (张光华), Liu Yichun (刘亦纯), Wang He (王鹤), Hu Boning (胡勃宁)

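School of Network and Information Security, Xidian University, Xi'an 710071

School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018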

Funding: National Natural Science Foundation of China (U1836210)

2024

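Netinfo Security (信息网络安全)
The Third Research Institute of the Ministry of Public Security; Computer Security Professional Committee of the China Computer Federation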

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals
Impact factor: 0.814
ISSN: 1671-1122
Year, volume (issue): 2024, 24(4)