Computer Engineering, 2024, Vol. 50, Issue (2): 132-139. DOI: 10.19678/j.issn.1000-3428.0067063

Robust Backward Model Watermarking Method Based on Backdoor

曾嘉忻 1, 张卫明 2, 张荣 1

Author Information

  • 1. School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, Anhui, China
  • 2. School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230027, Anhui, China

Abstract

Deep learning models are expensive to train but cheap to steal, so they are easily copied and redistributed. The copyright owner of a model can embed a watermark into it, for example through a backdoor, and prove ownership by verifying that watermark. Depending on the embedding stage, model watermarking can be divided into forward and backward model watermarking: forward model watermarking embeds the watermark at the start of training, whereas backward model watermarking embeds it after the original task has been trained, which requires little computation and is more flexible. However, existing backward model watermarking methods are weakly robust and cannot resist watermark-removal attacks such as fine-tuning and pruning. This study analyzes why backward model watermarking is less robust than forward model watermarking and, on that basis, proposes a general robust backward model watermarking method that introduces constraints on the model's intermediate-layer features and outputs during watermark embedding, reducing the impact of the watermark task on the original task and strengthening the robustness of the backward watermark. Experiments on the CIFAR-10, CALTECH-101, and GTSRB datasets show that the method effectively improves the robustness of backward model watermarking against fine-tuning attacks; on CIFAR-10, the best constraint setting improves the watermark verification success rate by an average of 24.2 percentage points over the backward watermarking baseline. The method also improves robustness against pruning and other attacks.
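
The constrained embedding step summarized in the abstract can be illustrated with a short code sketch. Below is a minimal PyTorch-style sketch of backward (post-training) watermark embedding with a backdoor trigger set: the trigger samples are pushed toward a target label while an intermediate-feature constraint and an output constraint on clean data keep the watermarked model close to the original one. The function name, the hooked layer ("layer3"), the MSE and KL forms of the two constraints, and the loss weights are illustrative assumptions, not the authors' exact configuration.

import copy
import torch
import torch.nn.functional as F

def embed_backdoor_watermark(model, trigger_loader, clean_loader,
                             feature_layer="layer3", target_label=0,
                             lam_feat=1.0, lam_out=1.0,
                             lr=1e-4, epochs=5, device="cuda"):
    # Backward watermarking: fine-tune an already-trained model on a trigger
    # set, while constraining intermediate features and outputs on clean data
    # to stay close to the original (frozen) model. Illustrative sketch only.
    model = model.to(device)
    frozen = copy.deepcopy(model).eval()   # reference copy of the original model
    for p in frozen.parameters():
        p.requires_grad_(False)

    # Capture intermediate features of the chosen layer via forward hooks.
    feats = {}
    def make_hook(name):
        def hook(_module, _inputs, output):
            feats[name] = output
        return hook
    h_wm = dict(model.named_modules())[feature_layer].register_forward_hook(make_hook("wm"))
    h_ref = dict(frozen.named_modules())[feature_layer].register_forward_hook(make_hook("ref"))

    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for (x_trig, _), (x_clean, _) in zip(trigger_loader, clean_loader):
            x_trig, x_clean = x_trig.to(device), x_clean.to(device)
            y_trig = torch.full((x_trig.size(0),), target_label,
                                dtype=torch.long, device=device)

            # 1) Watermark task: trigger samples must map to the target label.
            loss_wm = F.cross_entropy(model(x_trig), y_trig)

            # 2) Feature constraint: intermediate features on clean data should
            #    stay close to those of the original model.
            out_wm = model(x_clean)            # also refreshes feats["wm"]
            with torch.no_grad():
                out_ref = frozen(x_clean)      # fills feats["ref"]
            loss_feat = F.mse_loss(feats["wm"], feats["ref"])

            # 3) Output constraint: keep the output distribution on clean data
            #    close to the original model's (KL divergence).
            loss_out = F.kl_div(F.log_softmax(out_wm, dim=1),
                                F.softmax(out_ref, dim=1),
                                reduction="batchmean")

            loss = loss_wm + lam_feat * loss_feat + lam_out * loss_out
            opt.zero_grad()
            loss.backward()
            opt.step()

    h_wm.remove()
    h_ref.remove()
    return model

In a backdoor-based scheme of this kind, watermark verification amounts to checking how many trigger samples a suspect model still assigns to target_label; the watermark verification success rate reported above is that fraction, measured after attacks such as fine-tuning or pruning.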

Key words

deep learning model / model copyright protection / model watermarking / backdoor / robustness


Funding

Key Program of the Joint Funds of the National Natural Science Foundation of China (U20B2047)

Publication Year

2024

Computer Engineering (sponsored by the East China Institute of Computing Technology and the Shanghai Computer Society)
Indexed in: CSTPCD; Peking University Core Journal List
Impact factor: 0.581
ISSN: 1000-3428