Backdoor Sample Isolation Based on the Uniformity of Samples' Loss Value Changes
Backdoor attacks pose a potential threat to artificial intelligence applications. Unlearning-based robust training methods train backdoor-free models on untrusted datasets by isolating a subset of backdoor samples and unlearning it. However, incorrectly isolating and unlearning clean samples degrades the model's performance on clean data. To reduce the false isolation of clean samples and thereby protect model performance on clean data, a backdoor sample isolation scheme based on the uniformity of samples' loss value changes is proposed. During training, the loss values of backdoor samples change both strongly and relatively uniformly, so samples with low uniformity of loss value changes can be removed from the isolated set of potential backdoor samples. Experimental results indicate that the scheme reduces the false isolation of clean samples and protects the model's performance on clean data without weakening the defense against backdoor attacks.
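The abstract describes the scheme only at a high level. Below is a minimal illustrative sketch in Python of the isolation step as described, not the authors' implementation: the function name, the hyperparameters `isolation_ratio` and `uniformity_threshold`, and the concrete uniformity measure (an inverse coefficient of variation of the per-epoch loss decreases) are all assumptions made for illustration.

```python
import numpy as np

def isolate_backdoor_samples(loss_history, isolation_ratio=0.1,
                             uniformity_threshold=0.5):
    """Sketch of loss-change-based isolation with a uniformity filter.

    loss_history: array of shape (num_epochs, num_samples) holding each
    sample's training loss recorded after every epoch (assumed input).
    """
    # Per-epoch loss decrease for every sample, shape (num_epochs-1, num_samples).
    deltas = -np.diff(loss_history, axis=0)
    # Total loss decrease over training; backdoor samples are typically
    # learned fast, so their loss drops the most.
    total_change = deltas.sum(axis=0)

    # Step 1: isolate the samples with the largest loss decrease as
    # potential backdoor samples.
    num_isolated = max(1, int(isolation_ratio * loss_history.shape[1]))
    candidates = np.argsort(total_change)[-num_isolated:]

    # Step 2 (assumed uniformity measure): treat a candidate's loss changes
    # as uniform when the per-epoch decreases have low dispersion relative
    # to their mean; falsely isolated clean samples tend to fluctuate more.
    cand_deltas = deltas[:, candidates]
    mean = cand_deltas.mean(axis=0)
    std = cand_deltas.std(axis=0)
    uniformity = 1.0 / (1.0 + std / (np.abs(mean) + 1e-8))  # in (0, 1]

    # Remove low-uniformity candidates from the isolated set; the rest is
    # the final set of potential backdoor samples.
    return candidates[uniformity >= uniformity_threshold]
```

The unlearning step that consumes this set is likewise only named in the abstract; a common realization in unlearning-based robust training is gradient ascent on the isolated samples, sketched below under that assumption (all names here are hypothetical).

```python
import torch

def unlearn_isolated(model, optimizer, criterion, isolated_loader,
                     ascent_epochs=1):
    """Sketch of unlearning via gradient ascent on the isolated samples."""
    model.train()
    for _ in range(ascent_epochs):
        for inputs, targets in isolated_loader:
            optimizer.zero_grad()
            # Negating the loss turns the optimizer's descent into ascent,
            # so the model forgets the trigger-to-target mapping.
            loss = -criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
```

Fewer clean samples passing the uniformity filter means fewer clean samples fed into this ascent step, which is how the scheme protects clean-data performance.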

Keywords: Artificial Intelligence security; backdoor defense; robust training; backdoor sample isolation; neural network model

Zhang Jiahui (张家辉)


School of Cyber Engineering, Xidian University, Xi'an 710126, Shaanxi, China


2024

Modern Information Technology (现代信息科技)
Guangdong Electronics Society (广东省电子学会)

ISSN: 2096-4706
Year, Volume (Issue): 2024, 8(11)