Research on the Deep Learning Method Based on Data Feature Relevance and Adaptive Differential Privacy
Kang Haiyan (康海燕) 1, Wang Xiaoshi (王骁识) 1
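Author information
1. Department of Information Security, Beijing Information Science and Technology University, Beijing 100192, China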
Abstract
In differential-privacy-based privacy protection for deep learning, the length of the training period and the way the privacy budget is allocated directly constrain the utility of the model. Existing methods that combine deep learning with differential privacy suffer from a limited training period and unreasonable privacy budget allocation, which leads to poor model security and usability. To address this, we propose a deep learning method based on data feature Relevance and Adaptive Differential Privacy (RADP). First, the method applies the layer-wise relevance propagation algorithm to a pre-trained model to compute the average relevance of each feature over the original dataset. Next, an information-entropy-based method computes a privacy metric for each feature's average relevance, and Laplace noise is adaptively added to the average relevance according to that metric. On this basis, the privacy budget is allocated according to the noise-protected average relevance of each feature, and Laplace noise is adaptively added to the features. Finally, theoretical analysis shows that RADP satisfies ε-differential privacy and balances security and usability. Experimental results on three real datasets (MNIST, Fashion-MNIST, and CIFAR-10) show that RADP achieves better accuracy and average loss than the AdLM (Adaptive Laplace Mechanism) method, the DPSGD (Differential Privacy with Stochastic Gradient Descent) method, and the DPDLIGDO (Differentially Private Deep Learning with Iterative Gradient Descent Optimization) method, and that RADP remains stable.
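The last step described in the abstract, allocating the privacy budget by feature relevance and then adding Laplace noise to the features, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the allocation rule, so the proportional rule below is an assumption, and the names `allocate_budget`, `perturb_features`, `smoothing`, and `sensitivity` are hypothetical. The relevance scores are assumed to have been computed beforehand (for example by layer-wise relevance propagation) and already protected with noise.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def allocate_budget(relevances, total_epsilon, smoothing=1e-6):
    """Split total_epsilon across features in proportion to |relevance|.

    Assumption: a proportional rule; the source abstract does not give one.
    `smoothing` (hypothetical) keeps every per-feature budget strictly positive.
    """
    weights = [abs(r) + smoothing for r in relevances]
    total = sum(weights)
    return [total_epsilon * w / total for w in weights]


def perturb_features(features, relevances, total_epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise to each feature using its own share of the budget."""
    rng = rng or random.Random()
    budgets = allocate_budget(relevances, total_epsilon)
    noisy = []
    for x, eps in zip(features, budgets):
        scale = sensitivity / eps  # Laplace mechanism: scale = sensitivity / epsilon
        noisy.append(x + laplace_noise(scale, rng))
    return noisy, budgets


if __name__ == "__main__":
    features = [0.8, 0.1, 0.4]
    relevances = [0.5, 0.3, 0.2]  # illustrative values only
    noisy, budgets = perturb_features(features, relevances, total_epsilon=1.0,
                                      rng=random.Random(0))
    print(budgets)
    print(noisy)
```

Because the Laplace scale is the sensitivity divided by the budget, a feature with higher relevance receives a larger share of ε and therefore less noise under this rule. The per-feature budgets sum to the total ε.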
Key words
differential privacy/deep learning/layer-wise relevance propagation/information entropy/privacy metric/privacy budget/Laplace mechanism