Model protection scheme fusing internal and external feature watermarks
彭维平 (Peng Weiping) 1, 刘家宝 (Liu Jiabao) 1, 平源 (Ping Yuan) 2, 马迪 (Ma Di) 1, 宋成 (Song Cheng) 1
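Author information
- 1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454003
- 2. School of Information Engineering, Xuchang University, Xuchang, Henan 461000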
Abstract
To address the poor robustness and low extraction rates of classical model watermarking techniques in protecting model ownership, we propose a feature-embedding model protection scheme that combines the advantages of white-box and black-box watermarking. The dataset is partitioned by Shannon entropy into benign samples, style-transfer samples, and key samples. The style-transfer sample set is used to embed external features into the model, while the labels of the key samples are embedded into the model's internal features. A binary classifier is trained, and a masked gradient descent method modifies a very small number of parameters so that the model produces specific outputs; these results are combined to judge whether a model has been stolen. Experimental results show that the proposed scheme maintains high watermark fidelity at low overhead, remains stable under attacks such as label querying and knowledge distillation, and can evade malicious detection.
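The abstract names two mechanisms that a small sketch can make concrete: partitioning samples by Shannon entropy, and a masked gradient step that changes only a few parameters. The following is a minimal illustration under stated assumptions, not the authors' implementation. It assumes image samples with 8-bit intensities, entropy computed from the intensity histogram, and that the highest-entropy samples become key samples and the next-highest become style-transfer samples. The abstract does not specify the ranking direction, the group fractions, or how the mask is chosen; the function names, `style_frac`, `key_frac`, and the top-`k` gradient-magnitude mask are all hypothetical.

```python
import numpy as np


def shannon_entropy(image, bins=256):
    """Shannon entropy (in bits) of an image's pixel-intensity histogram."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def partition_by_entropy(images, style_frac=0.1, key_frac=0.01):
    """Split sample indices into benign / style-transfer / key groups.

    Assumption (not from the paper): samples are ranked by entropy, the top
    `key_frac` become key samples and the next `style_frac` become
    style-transfer samples; the rest are benign.
    """
    entropies = np.array([shannon_entropy(img) for img in images])
    order = np.argsort(entropies)  # ascending entropy
    n = len(images)
    n_key = max(1, int(round(n * key_frac)))
    n_style = max(1, int(round(n * style_frac)))
    key_idx = order[n - n_key:]
    style_idx = order[n - n_key - n_style:n - n_key]
    benign_idx = order[:n - n_key - n_style]
    return benign_idx, style_idx, key_idx


def masked_gradient_step(params, grad, k, lr=0.1):
    """One gradient-descent step restricted to the k largest-|grad| entries.

    Assumption (not from the paper): the mask selects parameters by gradient
    magnitude. All other parameters are left unchanged.
    """
    mask = np.zeros(params.shape, dtype=bool)
    top = np.argsort(np.abs(grad).ravel())[-k:]
    mask.flat[top] = True
    return params - lr * grad * mask, mask
```

In this sketch the three index arrays are disjoint and together cover the dataset, and `masked_gradient_step` returns the mask alongside the updated parameters so the caller can check how many parameters were touched.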
Key words
model protection / fusion watermarking / data partitioning / feature embedding
Funding
Henan Province Key Research and Development and Promotion Special Project (212102210084)
Year of publication
2024