Backdoor Sample Isolation Based on the Uniformity of Samples' Loss Value Changes
Backdoor attacks pose a potential threat to the deployment of AI applications. Unlearning-based robust training methods train backdoor-free models on untrusted datasets by isolating a subset of suspected backdoor samples and unlearning it. However, incorrectly isolating and unlearning clean samples degrades the model's performance on clean data. To reduce the false isolation of clean samples and thus protect model performance on clean data, a backdoor sample isolation scheme based on the uniformity of samples' loss value changes is proposed. During training, backdoor samples exhibit large and uniform changes in loss value, so samples with low uniformity of loss value changes are removed from the isolated set of potential backdoor samples. Experimental results indicate that the scheme reduces the false isolation of clean samples and preserves the model's performance on clean data without compromising its defense against backdoor attacks.
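The abstract does not specify how the uniformity of loss value changes is measured or how many isolated samples are removed. The sketch below is a minimal illustration of the filtering idea, assuming per-sample losses are recorded at each epoch, that uniformity is scored by the inverse coefficient of variation of the per-epoch loss changes, and that a hypothetical keep_ratio parameter controls how many isolated samples are retained; none of these names or choices come from the paper itself.

```python
import numpy as np

def filter_isolated_set(loss_history, isolated_idx, keep_ratio=0.9):
    """Filter a candidate backdoor-sample set by the uniformity of each
    sample's loss-value changes across training epochs.

    loss_history : (num_epochs, num_samples) array of per-sample losses
                   recorded at the end of each epoch.
    isolated_idx : indices of samples currently isolated as potential
                   backdoor samples.
    keep_ratio   : hypothetical parameter; fraction of isolated samples
                   to keep (those with the most uniform loss changes).
    """
    # Per-epoch loss changes for every sample: shape (num_epochs-1, num_samples).
    deltas = np.diff(loss_history, axis=0)

    # One possible uniformity score: changes that are consistent across
    # epochs have a low standard deviation relative to their mean magnitude,
    # so the inverse coefficient of variation is high.
    mean_change = np.abs(deltas.mean(axis=0)) + 1e-12
    uniformity = mean_change / (deltas.std(axis=0) + 1e-12)

    # Keep the isolated samples with the most uniform loss changes and drop
    # the rest, which are more likely to be falsely isolated clean samples.
    isolated_idx = np.asarray(isolated_idx)
    scores = uniformity[isolated_idx]
    n_keep = max(1, int(keep_ratio * len(isolated_idx)))
    keep = np.argsort(scores)[::-1][:n_keep]
    return isolated_idx[keep]
```

In use, the returned indices would replace the original isolated set before the unlearning step, so that samples with irregular loss trajectories (more characteristic of clean data) are no longer unlearned.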
Artificial Intelligence security; backdoor defense; robust training; backdoor sample isolation; neural network model