Journal of Army Engineering University of PLA, 2024, Vol. 3, Issue 1: 1-11. DOI: 10.12018/j.issn.2097-0730.20231212001

Adversarial Example Generation for Audio Classification Based on Time-Frequency Partitioned Perturbation

Zhang Xiongwei, Zhang Qiang, Yang Jibin, Sun Meng, Li Yihao

Author information

  • 1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, Jiangsu, China

Abstract

Adversarial examples (AEs) for audio classification generated by existing methods generally suffer from a low attack success rate and are easy to perceive. To address these problems, this paper first designs an audio AE generation framework based on time-frequency partitioned perturbation (TFPP). Leveraging the time-frequency characteristics of the audio signal, the framework divides the magnitude spectrum of the input signal into critical and non-critical regions and generates a corresponding adversarial perturbation for each. Building on this framework, the paper further proposes a generative adversarial network (GAN)-based AE generation method named TFPPGAN. TFPPGAN takes the partitioned magnitude spectra as input and, through adversarial training, adaptively adjusts the perturbation constraint coefficients of the partitions so that the perturbations in critical and non-critical regions are optimized simultaneously. Experiments on three typical audio classification datasets show that, compared with baseline methods, TFPPGAN improves the attack success rate by 4.7% and the signal-to-noise ratio by 5.5 dB, and raises the perceptual quality evaluation score of the generated adversarial speech examples by 0.15. In addition, the paper theoretically analyzes the feasibility of combining the TFPP framework with other attack methods and experimentally verifies the effectiveness of this combination.
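The partition-then-perturb idea in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's method: the abstract does not say how critical regions are chosen or how the constraint coefficients are learned, so the quantile rule, the fixed bounds `eps_critical` and `eps_noncritical`, and all function names below are hypothetical. The signal-to-noise ratio is computed over magnitude bins purely for demonstration; in practice it would normally be measured on waveforms.

```python
import math

def partition_mask(mag, quantile=0.5):
    """Mark a time-frequency bin as critical (True) when its magnitude is at
    or above the given quantile of all bins. ASSUMPTION: this energy rule is
    illustrative; the paper's partition criterion is not given in the abstract."""
    flat = sorted(v for row in mag for v in row)
    idx = min(len(flat) - 1, int(quantile * len(flat)))
    thresh = flat[idx]
    return [[v >= thresh for v in row] for row in mag]

def apply_partitioned_perturbation(mag, delta, mask, eps_critical, eps_noncritical):
    """Clip the raw perturbation `delta` to a per-region bound, add it to the
    magnitude spectrum, and keep every magnitude non-negative."""
    out = []
    for m_row, d_row, k_row in zip(mag, delta, mask):
        row = []
        for m, d, is_critical in zip(m_row, d_row, k_row):
            eps = eps_critical if is_critical else eps_noncritical
            clipped = max(-eps, min(eps, d))
            row.append(max(0.0, m + clipped))
        out.append(row)
    return out

def snr_db(clean, perturbed):
    """Signal-to-noise ratio in dB between two equally shaped 2-D arrays."""
    signal = sum(c * c for row in clean for c in row)
    noise = sum((p - c) ** 2
                for c_row, p_row in zip(clean, perturbed)
                for c, p in zip(c_row, p_row))
    if noise == 0:
        return math.inf
    return 10.0 * math.log10(signal / noise)

if __name__ == "__main__":
    # Toy 2x2 "magnitude spectrum" (rows: frequency bins, columns: frames).
    mag = [[1.0, 4.0], [2.0, 8.0]]
    delta = [[1.0, 1.0], [-1.0, -1.0]]  # raw, unconstrained perturbation
    mask = partition_mask(mag, quantile=0.5)
    adv = apply_partitioned_perturbation(mag, delta, mask,
                                         eps_critical=0.1, eps_noncritical=0.5)
    print(mask)
    print(adv)
    print(round(snr_db(mag, adv), 2), "dB")
```

In TFPPGAN, according to the abstract, the per-region constraint coefficients are adjusted adaptively during adversarial training rather than fixed by hand as they are here.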

Key words

audio classification/adversarial example/generative adversarial network/partitioned perturbation


Funding

National Natural Science Foundation of China (62071484)

Publication year

2024
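Journal of Army Engineering University of PLA
Scientific Research Department, PLA University of Science and Technology

Impact factor: 0.556
ISSN: 2097-0730
Number of references: 28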