Towards sustainable adversarial training with successive perturbation generation
Adversarial training with online-generated adversarial examples has achieved promising performance in defending against adversarial attacks and improving the robustness of convolutional neural network (CNN) models. However, most existing adversarial training methods are dedicated to finding strong adversarial examples that force the model to learn the adversarial data distribution, which inevitably imposes a large computational overhead and degrades generalization performance on clean data. In this paper, we show that progressively enhancing the adversarial strength of adversarial examples across training epochs can effectively improve model robustness, and that appropriate model shifting can preserve the generalization performance of models at negligible computational cost. To this end, we propose a successive perturbation generation scheme for adversarial training (SPGAT), which progressively strengthens adversarial examples by adding perturbations to the adversarial examples transferred from the previous epoch, and shifts models across epochs to improve the efficiency of adversarial training. The proposed SPGAT is both efficient and effective; e.g., the computation time of our method is 900 min versus 4100 min for standard adversarial training, and the performance boost is more than 7% in adversarial accuracy and 3% in clean accuracy. We extensively evaluate SPGAT on various datasets, including small-scale MNIST, middle-scale CIFAR-10, and large-scale CIFAR-100. The experimental results show that our method is more efficient while performing favorably against state-of-the-art methods.
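The two ideas in the abstract, reusing and strengthening the perturbation transferred from the previous epoch instead of running a fresh multi-step attack, and averaging model weights across epochs, can be illustrated with a toy sketch. This is our own simplification under stated assumptions: a logistic model stands in for a CNN, a single FGSM-style sign step stands in for the paper's exact attack, and all names (`grad_loss`, `delta`, `w_avg`) are hypothetical, not from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linearly separable binary labels (stand-in for an image dataset).
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)

def grad_loss(w, X, y):
    """Gradients of the logistic loss w.r.t. the inputs and the weights."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    err = p - y                                    # (n,)
    return np.outer(err, w), X.T @ err / len(y)    # dL/dX, dL/dw

w = np.zeros(10)
w_avg = np.zeros(10)          # running weight average ("model shifting", SWA-style)
delta = np.zeros_like(X)      # perturbations carried over across epochs
eps, alpha, lr = 0.3, 0.1, 0.1

for epoch in range(20):
    # Successive perturbation generation: one gradient-sign step stacked on the
    # perturbation transferred from the previous epoch, clipped to the eps ball.
    gX, _ = grad_loss(w, X + delta, y)
    delta = np.clip(delta + alpha * np.sign(gX), -eps, eps)
    # Train on the progressively strengthened adversarial examples.
    _, gw = grad_loss(w, X + delta, y)
    w -= lr * gw
    # Average weights across epochs to help preserve clean-data generalization.
    w_avg = (epoch * w_avg + w) / (epoch + 1)

acc_clean = np.mean(((X @ w_avg) > 0) == y.astype(bool))
```

Because each epoch performs only one attack step on a warm-started perturbation, the per-epoch attack cost matches single-step methods while the perturbation strength accumulates over training, which is the efficiency argument the abstract makes.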

Adversarial training; Adversarial attack; Stochastic weight average; Machine learning; Model generalization

Wei LIN, Lijuan LIAO


School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China

Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou 350118, China

School of Economics and Management, Xi'an University of Technology, Xi'an 710048, China


Scientific Research and Development Foundation of Fujian University of Technology, China (No. GYZ220209)

2024

Frontiers of Information Technology & Electronic Engineering
Zhejiang University


CSTPCD
Impact factor: 0.371
ISSN:2095-9184
Year, volume (issue): 2024, 25(4)