Generalized adversarial defense against unseen attacks: a survey
Deep learning-based models have achieved impressive breakthroughs in various areas in recent years. However, they are vulnerable when their inputs are perturbed by imperceptible adversarial noise, which can easily lead to wrong outputs. To tackle this problem, many defense methods have been proposed to mitigate the effect of such threat models on deep neural networks. As adversaries continue to improve their techniques for disrupting model performance, an increasing number of attacks that are unseen by the model during training are emerging. Consequently, defense mechanisms that guard against only specific types of adversarial perturbations are becoming less robust, and a model's ability to defend generally against various unseen attacks becomes pivotal. Unseen attacks should differ as much as possible from the attacks used during training in terms of theory and attack performance, rather than being mere parameter adjustments of the same attack method. The core goal is to defend against arbitrary attacks via efficient training procedures, while the defense should depend as little as possible on the adversarial attacks seen during training.

Our survey aims to summarize and analyze the existing adversarial defense methods against unseen adversarial attacks. We first briefly review the background of defending against unseen attacks. One main reason that a model is robust against unseen attacks is that it can extract robust features through a specially designed training mechanism, without an explicitly designed defense mechanism with special internal structures. Alternatively, a robust model can be achieved by modifying its structure or adding extra modules. We therefore divide these methods into two categories: training mechanism-based defense and model structure-based defense.

The former mainly seeks to improve the quality of the robust features extracted by the model during its training process. 1) Adversarial training is one of the most effective adversarial defense strategies, but it easily overfits to specific types of adversarial noise. Well-designed training attacks can explicitly improve the model's ability to explore the perturbation space during training, which helps it learn more representative features in that space than traditional adversarial attacks do. Adding regularization terms is another way to obtain robust models by improving on the robust features learned in the basic training process (a minimal sketch combining adversarial training with such a regularizer follows this paragraph). Furthermore, we introduce adversarial training-based methods that incorporate knowledge from other domains, such as domain adaptation, pre-training, and fine-tuning. Different examples contribute differently to the model's robustness, so example reweighting is another way to achieve robustness against attacks. 2) Standard training is the most basic training method in deep learning. Data augmentation methods focus on example diversity in standard training, while adding regularization terms to standard training aims to stabilize the model's outputs. Pre-training strategies aim to achieve a robust model within a pre-defined perturbation bound. 3) We also find that contrastive learning is a useful strategy, as its core ideas about feature similarity match well with the goal of acquiring representative robust features.
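To make the adversarial training and regularization ideas above concrete, the following is a minimal PyTorch-style sketch, not a method prescribed by this survey: pgd_attack crafts L-infinity perturbations, and train_step combines a clean cross-entropy loss with a TRADES-like consistency (KL) regularizer. The model, optimizer, and all hyperparameters (eps, alpha, steps, beta) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft L-infinity PGD adversarial examples around x (pixels in [0, 1])."""
    # Random start inside the eps-ball, then iterative gradient ascent.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and valid range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(x + (x_adv - x).clamp(-eps, eps), 0, 1)
    return x_adv.detach()

def train_step(model, optimizer, x, y, beta=6.0):
    """One adversarial-training step with a TRADES-like consistency regularizer."""
    x_adv = pgd_attack(model, x, y)
    logits_clean, logits_adv = model(x), model(x_adv)
    # Clean cross-entropy plus a KL term pulling outputs on adversarial inputs
    # toward outputs on clean inputs -- the "regularization term" family of
    # defenses mentioned above.
    loss = F.cross_entropy(logits_clean, y) + beta * F.kl_div(
        F.log_softmax(logits_adv, dim=1),
        F.softmax(logits_clean, dim=1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that this sketch still ties the defense to one attack family (L-infinity PGD); the survey's point is precisely that such training can overfit to the perturbations it has seen.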
Model structure-based defense, meanwhile, mainly addresses intrinsic drawbacks of the model's structure. It is divided into structure optimization for the target network and input data pre-processing, according to how the structures are modified. 1) Structure optimization for the target network aims to enhance the model's ability to obtain useful information from inputs and features, because the network itself is susceptible to variations in them. 2) Input data pre-processing focuses on eliminating threats from examples before feeding them into the target network. Removing adversarial noise from inputs and detecting adversarial examples in order to reject them are two popular strategies, because they are easy to model and rely less on adversarial examples during training than methods such as adversarial training (a minimal sketch of such a detector is given at the end of this summary).

Finally, we analyze research trends in this area and summarize related work in other domains. 1) Defending well against multiple known adversarial perturbations does not guarantee that the model is robust against various unseen attacks, but it does improve robustness compared with defending against only one specific type of perturbation. 2) With the development of defenses against unseen adversarial attacks, auxiliary tools such as acceleration modules have been proposed. 3) Defending against unseen common corruptions benefits practical applications of defense methods, because adversarial perturbations cannot represent the whole perturbation space of the real world.

To summarize, defending against attacks that are entirely different from those used during training offers stronger generalizability. Analyzing defenses from this perspective distinguishes our survey from traditional surveys on adversarial defense. We hope that this survey can further motivate research on defending against unseen adversarial attacks.
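As one concrete instance of the detection strategy mentioned above, below is a minimal sketch in the spirit of feature squeezing (Xu et al.): run the classifier on the original input and on two "squeezed" copies, and flag the input when the predictions disagree too strongly. The particular squeezers and the threshold are illustrative assumptions that would need tuning per dataset; the survey does not prescribe this specific detector.

```python
import torch
import torch.nn.functional as F

def reduce_bit_depth(x, bits=4):
    """Squeeze color bit depth: quantize pixels in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels

def median_smooth(x, k=3):
    """Apply a k x k median filter per channel (local spatial smoothing)."""
    pad = k // 2
    patches = F.pad(x, (pad, pad, pad, pad), mode="reflect")
    patches = patches.unfold(2, k, 1).unfold(3, k, 1)  # (B, C, H, W, k, k)
    return patches.contiguous().flatten(-2).median(dim=-1).values

@torch.no_grad()
def is_adversarial(model, x, threshold=1.0):
    """Flag inputs whose predictions shift too much under input squeezing."""
    p = F.softmax(model(x), dim=1)
    p_bit = F.softmax(model(reduce_bit_depth(x)), dim=1)
    p_med = F.softmax(model(median_smooth(x)), dim=1)
    # L1 distance between original and squeezed predictions; a large shift
    # suggests the input sits in a brittle, likely adversarial region.
    score = torch.maximum(
        (p - p_bit).abs().sum(dim=1),
        (p - p_med).abs().sum(dim=1),
    )
    return score > threshold
```

Because the squeezers are fixed, generic transformations rather than attack-specific countermeasures, a detector of this kind illustrates why pre-processing defenses depend comparatively little on the adversarial examples seen during training.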