
Three-Stage Adversarial Perturbation Generation Active Defense Algorithm for Facial Attribute Editing

Against malicious facial attribute editing, forensics-based passive defense can only provide evidence of tampering after the fact and cannot prevent it, making it difficult to eliminate the losses that malicious tampering has already caused. Active defense technology has therefore emerged: it disrupts the output of attribute editing so that the face cannot be usefully tampered with. However, the existing two-stage training active defense framework against facial tampering suffers from insufficient transferability and perturbation robustness. This paper therefore proposes a three-stage adversarial perturbation active defense framework by optimizing the two-stage training architecture and its loss functions and by introducing an auxiliary classifier. First, the substitute target model in the two-stage architecture is modified and, based on it, an attribute editing loss for training the perturbation generator is designed, improving the reconstruction performance and attribute constraint ability of the substitute model and thus reducing overfitting to the substitute model. Second, an auxiliary classifier is introduced in the training phase to classify the source attributes of the features encoded by the substitute model, and a corresponding auxiliary classifier loss for training the perturbation generator is designed; the original two-stage alternate training is thereby changed to a three-stage alternate training of the substitute target model, the auxiliary classifier, and the perturbation generator, with the expectation that adversarially attacking the auxiliary classifier promotes active defense against the tampering model. Finally, an attack layer is introduced into the training of the perturbation generator to improve the robustness of the adversarial perturbation against filtering and JPEG compression. Experimental results verify that the proposed framework transfers active defense from the white-box substitute target model to black-box attribute editing models better than existing frameworks, improving black-box performance by 16.17%; the generated adversarial perturbations are also more robust than those of baseline algorithms, improving performance (PSNR) by 13.91% under JPEG compression and by 17.76% under Gaussian filtering.
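As a structural illustration only, the three-stage alternate training described above (substitute target model, then auxiliary classifier, then perturbation generator, updated in turn) might be organized as in the sketch below. The routine names and the per-mini-batch alternation granularity are assumptions for illustration, not the paper's actual code.

```python
def train_three_stage(update_surrogate, update_classifier, update_generator,
                      batches, epochs=1):
    """Alternate the three per-stage updates on each mini-batch.

    A per-batch schedule is assumed here; the paper may alternate the
    stages at a different granularity.
    """
    for _ in range(epochs):
        for batch in batches:
            # Stage 1: substitute target model (reconstruction + attribute editing)
            update_surrogate(batch)
            # Stage 2: auxiliary classifier on the substitute model's encoded features
            update_classifier(batch)
            # Stage 3: perturbation generator, trained adversarially against
            # both the substitute model and the auxiliary classifier
            update_generator(batch)


# Toy usage: record the call order to show the alternation.
log = []
train_three_stage(lambda b: log.append(("surrogate", b)),
                  lambda b: log.append(("classifier", b)),
                  lambda b: log.append(("generator", b)),
                  batches=[0, 1])
```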
Three-Stage Adversarial Perturbation Generation Active Defense Algorithm for Facial Attribute Editing
With the gradual maturity of deep generation technology, facial images generated by facial attribute editing technologies have become difficult to distinguish from genuine ones. Once these technologies are maliciously used, for example to infringe on personal privacy or to manipulate public opinion, they may trigger moral, social, and security problems. Regarding such malicious facial attribute editing, although current forensics-based passive defense technology has achieved considerable performance, it can only provide evidence of tampering and cannot prevent its occurrence, making it difficult to eliminate the losses that malicious tampering has already caused. Active defense technology has therefore emerged: it prevents faces from being tampered with by disrupting the output of facial attribute editing. However, the existing two-stage training active defense framework for facial attribute editing suffers from insufficient transferability and perturbation robustness. Therefore, this paper proposes a three-stage adversarial perturbation active defense framework for facial attribute editing by optimizing the two-stage training architecture and its loss functions and by introducing an auxiliary classifier. This paper first modifies the substitute target model in the two-stage training architecture and designs an attribute editing loss for training the perturbation generator, so as to improve the reconstruction performance and attribute constraint ability of the substitute model and thus reduce overfitting to the substitute model. Secondly, an auxiliary classifier is introduced in the training phase to classify the source attributes of the features encoded by the substitute model, and a corresponding auxiliary classifier loss is designed for training the perturbation generator; the original two-stage alternate training is thus changed to a three-stage alternate training of the substitute target model, the auxiliary classifier, and the perturbation generator, with the expectation that adversarially attacking the auxiliary classifier promotes active defense against the tampering model. Finally, an attack layer is introduced into the training of the perturbation generator to enhance the robustness of the adversarial perturbation against filtering and Joint Photographic Experts Group (JPEG) compression. Experimental results on five facial attribute editing models (StarGAN, AttGAN with difference attribute vector input, AttGAN with target attribute vector input, STD-GAN, and a style-aware model) show that the proposed framework transfers active defense from the white-box substitute model to black-box attribute editing models better than existing frameworks, improving peak signal-to-noise ratio (PSNR) by 16.17% in the black-box case; the generated adversarial perturbation is also more robust against JPEG compression and filtering than the baseline, improving PSNR by 13.91% under JPEG compression and by 17.76% under Gaussian filtering.
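The attack layer mentioned above can be approximated as in the NumPy sketch below. This is a hedged illustration: the paper does not specify the layer's exact distortions or parameters here, so the separable Gaussian blur, the uniform-quantization stand-in for JPEG (real JPEG quantizes DCT coefficients), the random branch selection, and all parameter values are assumptions.

```python
import numpy as np


def gaussian_kernel1d(sigma):
    # 1-D Gaussian kernel truncated at 3 standard deviations, normalized to sum 1.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()


def gaussian_blur(img, sigma=1.0):
    # Separable blur: convolve the rows, then the columns, of a 2-D image.
    k = gaussian_kernel1d(sigma)
    img = np.apply_along_axis(np.convolve, 0, img, k, mode="same")
    img = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return img


def jpeg_like_quantize(img, step=8.0):
    # Crude stand-in for JPEG: uniform intensity quantization, mimicking
    # only the information loss, not the actual DCT-domain coding.
    return np.round(img / step) * step


def attack_layer(adv_img, rng):
    # Randomly apply one distortion per training step so the perturbation
    # generator is pushed to survive all of them; the identity branch
    # preserves performance on undistorted images.
    branch = rng.integers(3)
    if branch == 0:
        return gaussian_blur(adv_img, sigma=1.0)
    if branch == 1:
        return jpeg_like_quantize(adv_img)
    return adv_img
```

During generator training, the perturbed face would pass through `attack_layer` before being fed to the substitute target model, so the adversarial objective is optimized through the distortion.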

facial attribute editing; active defense; adversarial attack; auxiliary classifier; alternate training

CHEN Beijing, ZHANG Haitao, LI Yuru


Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing 210044, China

Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China

School of Computer Science, School of Cyberspace Security, Nanjing University of Information Science and Technology, Nanjing 210044, China


National Natural Science Foundation of China (62072251, 62072250)

2024

Chinese Journal of Computers
China Computer Federation; Institute of Computing Technology, Chinese Academy of Sciences

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 3.18
ISSN: 0254-4164
Year, Volume (Issue): 2024, 47(3)