现代信息科技, 2024, Vol. 8, Issue 11: 44-48. DOI: 10.19850/j.cnki.2096-4706.2024.11.009

Backdoor Sample Isolation Based on the Uniformity of Samples' Loss Value Changes

张家辉

Author Information

  • 1. School of Cyber Engineering, Xidian University, Xi'an 710126, Shaanxi, China

Abstract

Backdoor attacks pose a potential threat to AI applications. Unlearning-based robust training methods can train backdoor-free models on untrusted datasets by isolating a subset of suspected backdoor samples and unlearning it. However, incorrectly isolating and unlearning clean samples degrades the model's performance on clean data. To reduce the false isolation of clean samples and thereby protect performance on clean data, a backdoor sample isolation scheme based on the uniformity of samples' loss value changes is proposed. During training, the loss values of backdoor samples change both markedly and uniformly, so samples with low uniformity of loss value changes can be removed from the isolated set of potential backdoor samples. Experimental results show that the scheme reduces the false isolation of clean samples and protects the model's performance on clean data without weakening the defense against backdoor attacks.
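The filtering step described in the abstract — ranking the isolated samples by how uniformly their loss values change across training epochs and discarding the least uniform ones — can be sketched as below. The function name `filter_isolated_set`, the choice of the standard deviation of per-epoch loss deltas as an (inverse) uniformity measure, and the `keep_ratio` parameter are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def filter_isolated_set(loss_history, isolated_idx, keep_ratio=0.8):
    """Keep only the most-uniform samples in the isolated set.

    loss_history: array of shape (epochs, n_samples) with per-sample
        training losses recorded at each epoch (assumed available).
    isolated_idx: indices of samples already isolated as potential
        backdoor samples (e.g. by a low-loss criterion).
    keep_ratio: fraction of the isolated set to keep.
    """
    # Per-epoch loss changes for the isolated samples only.
    deltas = np.diff(loss_history[:, isolated_idx], axis=0)
    # A low standard deviation of the deltas means the loss changes
    # uniformly across epochs; negate so higher = more uniform.
    uniformity = -deltas.std(axis=0)
    order = np.argsort(uniformity)[::-1]  # most uniform first
    k = max(1, int(keep_ratio * len(isolated_idx)))
    return [isolated_idx[i] for i in order[:k]]
```

For example, a sample whose loss drops by the same amount every epoch (uniform changes, backdoor-like) would be ranked above one whose loss fluctuates, so the latter is the first to be removed from the isolated set.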

Key words

Artificial Intelligence security / backdoor defense / robust training / backdoor sample isolation / neural network model


Year of publication: 2024
Journal: 现代信息科技 (Modern Information Technology), published by 广东省电子学会 (Guangdong Electronics Society)
ISSN: 2096-4706
Number of references: 10