Journal of Computer Research and Development (计算机研究与发展), 2024, Vol. 61, Issue 5: 1182-1192. DOI: 10.7544/issn1000-1239.202330521

Perceptual Similarity-Based Multi-Objective Optimization for Stealthy Image Backdoor Attack

Zhu Suxia (朱素霞), Wang Jinyin (王金印), Sun Guanglu (孙广路)

Author Information

  • 1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080; Heilongjiang Provincial Key Laboratory of Intelligent Information Processing and Application (Harbin University of Science and Technology), Harbin 150080

Abstract

Deep learning models are vulnerable to backdoor attacks: they behave normally when processing clean data but exhibit malicious behavior when processing poisoned samples that carry trigger patterns. However, the backdoor images produced by most existing backdoor attacks are easily perceived by the human eye, so these attacks lack stealthiness. We therefore propose a perceptual similarity-based multi-objective optimization method for stealthy image backdoor attacks. First, a perceptual similarity loss function is used to reduce the visual difference between the backdoor image and the original image. Second, a multi-objective optimization method resolves the conflict between tasks on the poisoned model, ensuring that the model's performance remains stable after poisoning. Finally, a two-stage training method automates the generation of trigger patterns and improves training efficiency. Experimental results show that the generated backdoor images are difficult for the human eye to distinguish from the original images, with no degradation in clean accuracy. The backdoor attack is also successfully performed on the target classification model: under the all-to-one attack strategy, the attack success rate reaches 100% on all experimental datasets. Compared with other stealthy image backdoor attack methods, our method achieves better stealthiness.
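The abstract describes jointly optimizing a perceptual-similarity term (stealthiness) against an attack term (misclassification to the target label). The paper's actual generator architecture and learned perceptual metric are not given on this page, so the following is a minimal, self-contained sketch of that trade-off only: plain L2 distance stands in for the perceptual metric, a fixed random linear classifier stands in for the attacked model, and the two objectives are combined by linear scalarization with weight `lam`. All names (`losses`, `grad`, `lam`) are illustrative, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a flattened "clean image" x and a fixed linear classifier W.
d, n_classes = 32, 4
W = rng.normal(size=(n_classes, d))
x = rng.normal(size=d)
target = 2  # attacker's chosen target label (all-to-one setting)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def losses(delta):
    # Stealthiness proxy: L2 distance between backdoor and clean image.
    # (A stand-in for the learned perceptual similarity metric.)
    l_percep = np.sum(delta ** 2)
    # Attack loss: cross-entropy of the poisoned input w.r.t. the target label.
    p = softmax(W @ (x + delta))
    l_attack = -np.log(p[target] + 1e-12)
    return l_percep, l_attack

def grad(delta, lam):
    # Analytic gradient of the scalarized objective lam*l_percep + l_attack.
    p = softmax(W @ (x + delta))
    g_attack = W.T @ (p - np.eye(n_classes)[target])
    return 2.0 * lam * delta + g_attack

# Optimize the trigger pattern delta by gradient descent on the combined loss.
delta = np.zeros(d)
lam = 0.01  # balances stealthiness against attack success
for _ in range(500):
    delta -= 0.05 * grad(delta, lam)

l_percep, l_attack = losses(delta)
pred = int(np.argmax(W @ (x + delta)))
print(pred, l_percep, l_attack)
```

With a small `lam`, the attack term dominates and the optimized `delta` drives the classifier's prediction to the target label while the perceptual term keeps the perturbation small; raising `lam` trades attack strength for stealthiness, which is the conflict the paper's multi-objective method is designed to manage.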

Keywords

backdoor attack; covert backdoor; poisoning attack; deep learning; model security

Funding

Natural Science Foundation of Heilongjiang Province (LH2021F032)

Key Research and Development Program of Heilongjiang Province (2022ZX01A34)

Publication Year

2024
Journal of Computer Research and Development (计算机研究与发展)
Sponsored by the Institute of Computing Technology, Chinese Academy of Sciences, and the China Computer Federation
Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)
Impact factor: 2.649
ISSN: 1000-1239
References: 25