A Multi-Layer Feature Importance Attack Method Based on Iterative Accumulated Gradients
The transferability of adversarial samples is crucial for attacking unknown models, as it makes adversarial attacks feasible in practical scenarios. Existing transfer attacks tend to distort features indiscriminately in order to degrade the prediction accuracy of the source model, but they overlook the intrinsic features of the objects in the images. Inspired by existing work on feature importance extraction, this paper proposes a method termed multi-layer accumulated gradient attack, which disrupts the crucial object-aware features that dominate the model decision. Specifically, this paper introduces iterative accumulated gradients to quantify feature importance; the resulting importance estimates are highly correlated with the target object and help improve transfer attacks. Furthermore, by combining attacks across several intermediate layers, this paper obtains the final multi-layer accumulated gradient attack. Experimental results show that, compared with the best-performing existing method, the proposed method achieves comparable attack success rates against normally trained models and raises the attack success rate against defense models by 2.6 percentage points.
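To make the idea of gradient-based feature importance concrete, the following is a minimal sketch rather than the method of this paper. It assumes a hypothetical toy model with a single intermediate layer, a hand-derived gradient, and one plausible reading of "iterative accumulated gradients": the gradient of the class score with respect to the intermediate features is summed over the attack iterations and used to weight those features in a sign-gradient step. All names (`W1`, `V`, `accumulated_gradient_attack`, `eps`, `steps`) are illustrative assumptions; a multi-layer variant would presumably sum the same weighted-feature loss over several layers.

```python
import math

# Hypothetical toy model (not from the paper):
#   h = W1 @ x                      intermediate-layer features
#   logit = sum_j V[j] * tanh(h[j]) score of the true class
W1 = [[0.5, -0.2, 0.1],
      [0.3, 0.8, -0.5]]
V = [1.0, -0.7]


def features(x):
    """Intermediate-layer features h = W1 @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W1]


def logit(x):
    """Scalar score of the true class for input x."""
    return sum(v * math.tanh(h) for v, h in zip(V, features(x)))


def feature_gradient(x):
    """Analytic gradient of the logit with respect to the features h."""
    return [v * (1.0 - math.tanh(h) ** 2) for v, h in zip(V, features(x))]


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def accumulated_gradient_attack(x, eps=0.3, steps=5):
    """Sign-gradient attack weighted by feature gradients summed over iterations.

    Assumption: importance is the running sum of feature gradients taken at
    each intermediate adversarial point; the paper may define it differently.
    """
    alpha = eps / steps
    x_adv = list(x)
    accumulated = [0.0] * len(W1)
    for _ in range(steps):
        # Accumulate the feature gradient at the current adversarial point.
        grad_h = feature_gradient(x_adv)
        accumulated = [a + g for a, g in zip(accumulated, grad_h)]
        # Feature-level loss: sum_j accumulated[j] * h_j(x).
        # Because h is linear in x here, its input gradient is W1^T @ accumulated.
        grad_x = [sum(accumulated[j] * W1[j][i] for j in range(len(W1)))
                  for i in range(len(x))]
        # Step so as to suppress the important features, then project the
        # result back into the L-infinity ball of radius eps around x.
        x_adv = [xi - alpha * sign(g) for xi, g in zip(x_adv, grad_x)]
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
    return x_adv, accumulated


x_clean = [1.0, 0.5, -0.5]
x_adv, importance = accumulated_gradient_attack(x_clean)
```

In this toy setting the perturbation stays within the `eps` budget and lowers the class score of the clean input. In a real network the feature gradients would come from automatic differentiation at the chosen layers instead of a closed-form expression.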