Text adversarial attack capability enhancement method based on automatic dilution
Training with adversarial examples enhances the robustness of deep neural networks. Improving the success rate of adversarial attacks is therefore significant in adversarial example research. Diluting original samples can bring them closer to the decision boundary of the model, thereby increasing the success rate of adversarial attacks. However, existing dilution algorithms suffer from issues such as reliance on manually generated dilution pools and a single dilution target. This paper proposes a method to enhance the capability of text adversarial attacks based on automatic dilution, called the Automatic Multi-positional Dilution Preprocessing (AMDP) algorithm. The AMDP algorithm eliminates the reliance on manual assistance in the dilution process and generates different dilution pools for different datasets and target models. Additionally, AMDP extends the set of words targeted for dilution, broadening the search space of dilution operations. As an input transformation method, AMDP can be combined with other adversarial attack algorithms to further enhance attack performance. Experimental results demonstrate that AMDP increases the attack success rate by approximately 10% on average on BERT, WordCNN, and WordLSTM classification models, while reducing the average modification rate of original samples and the average number of queries to the target model.
adversarial machine learning; adversarial samples; text dilution; classification boundaries; natural language processing