FedKRec: A Federated Learning Recommendation Algorithm with Anonymization-Based Privacy Protection

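Abstract (translated from the Chinese): Recommender systems based on federated learning distribute model training across many local devices instead of sharing data with the server, thereby protecting the privacy of user data. Most existing methods broadcast the server-side item feature matrix to the clients, which compute the loss locally and send the item gradients back to the server for the update; this carries a risk of leaking users' interests and preferences. To address this problem, this paper proposes FedKRec, an anonymization-based federated learning recommendation algorithm that effectively avoids such privacy leakage. Specifically, inspired by the idea of k-anonymity, FedKRec hides the gradients of the (private) positive samples among the gradients of K static negative samples when uploading gradient information to the server. First, an analysis of real datasets shows that the category distribution of positive-sample items leaks user interests and preferences to some extent, so an adaptive negative sampling method that takes item category balance into account is proposed. Second, because the gradient magnitudes of positive and negative samples differ considerably, which can easily leak positive-sample information, Gaussian noise is added to the gradients of both positive and negative samples so that an attacker cannot accurately identify the positive samples. Finally, it is proved theoretically that, in terms of the item category distribution, the noised set of positive and negative samples does not reveal user preferences. Experiments on several public datasets show that the proposed FedKRec algorithm achieves recommendation performance comparable to that of traditional methods while effectively protecting user privacy.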

English title: FedKRec: Privacy-preserving Federated Learning for Recommendation Based on Anonymity

English abstract: A recommendation system based on federated learning distributes model training across multiple local devices without sharing data with the server, in order to protect the privacy of user data. Most existing methods broadcast the item feature matrix from the server to the users, who compute the loss and send the item gradients back to the server for the update, at the risk of leaking user interests and preferences. To address this issue, this article proposes FedKRec, a federated learning recommendation algorithm based on anonymization, to avoid privacy breaches. Inspired by the idea of k-anonymity, FedKRec hides the gradients of the (private) positive samples among the gradients of K static negative samples when uploading gradient information to the server. First, an analysis of real datasets shows that the category distribution of positive-sample items can leak user interests and preferences, so we propose an adaptive negative sampling method that takes item category balance into account. Second, the significant difference in gradient magnitude between positive and negative samples can easily leak information about the positive samples, so we propose adding a certain amount of Gaussian noise to the gradients of both positive and negative samples, which prevents attackers from accurately identifying the positive samples. Finally, we prove theoretically that, in terms of the item category distribution, the set of positive and negative samples with added noise does not reveal user preferences. Experimental results on multiple public datasets show that the proposed FedKRec algorithm achieves recommendation performance comparable to that of traditional methods while effectively protecting user privacy.
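Illustrative sketch (not taken from the paper): the abstract names three client-side steps, namely category-balanced negative sampling, Gaussian noise on all gradients, and uploading positives mixed with K negatives. The Python below shows one way these steps could fit together. Everything in it is an assumption made for illustration: the function names, the greedy balancing heuristic (a stand-in for the paper's adaptive sampler, whose details the abstract does not give), the noise scale sigma, and the dictionary item_grads, which stands for gradients the client has already computed locally for every candidate item.

```python
import random
from collections import Counter


def sample_balanced_negatives(positives, item_category, k, rng):
    """Greedily pick k non-interacted items so that the category counts of
    positives + negatives end up as even as possible (illustrative heuristic,
    not the paper's sampler)."""
    positive_set = set(positives)
    pools = {}  # category -> shuffled list of candidate negative items
    for item, category in item_category.items():
        if item not in positive_set:
            pools.setdefault(category, []).append(item)
    for pool in pools.values():
        rng.shuffle(pool)

    counts = Counter(item_category[i] for i in positives)
    negatives = []
    while len(negatives) < k:
        available = [c for c in pools if pools[c]]
        if not available:
            raise ValueError("not enough candidate negatives for the requested k")
        # Fill whichever category is currently least represented.
        category = min(available, key=lambda c: counts[c])
        negatives.append(pools[category].pop())
        counts[category] += 1
    return negatives


def add_gaussian_noise(gradient, sigma, rng):
    """Add independent N(0, sigma^2) noise to every gradient component."""
    return [g + rng.gauss(0.0, sigma) for g in gradient]


def build_upload(item_grads, positives, item_category, k, sigma, rng):
    """Return noised gradients for the positives mixed with k negatives."""
    negatives = sample_balanced_negatives(positives, item_category, k, rng)
    chosen = list(positives) + negatives
    rng.shuffle(chosen)  # upload order should not separate positives from negatives
    return {item: add_gaussian_noise(item_grads[item], sigma, rng) for item in chosen}


if __name__ == "__main__":
    rng = random.Random(0)
    item_category = {i: "ABC"[i // 10] for i in range(30)}  # 3 toy categories
    item_grads = {i: [0.1, -0.2] for i in range(30)}        # toy 2-d gradients
    upload = build_upload(item_grads, [0, 1, 2, 3], item_category,
                          k=8, sigma=0.05, rng=rng)
    print(sorted(upload))  # item ids sent to the server, positives included
```

In this toy setup the four positives all belong to one category, so the greedy sampler draws the negatives from the other two categories until the counts even out. How K and the noise scale should be chosen, and what the privacy guarantee actually is, are questions for the paper itself.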

English keywords:

federated learning; distributed learning; recommender system; privacy-preserving; anonymity technology

Authors:

Li Bo (黎博), Li Shilong (李世龙), Jiang Linying (姜琳颖), Yang Enneng (杨恩能), Guo Guibing (郭贵冰)

Affiliation:

School of Software, Northeastern University, Shenyang, Liaoning 110169, China

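Keywords (translated from the Chinese):

federated learning; distributed learning; recommender systems; privacy protection; anonymization techniques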

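Funding:

National Natural Science Foundation of China; Liaoning Provincial Science Program Project; Fundamental Research Funds for the Central Universities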

Project numbers:

62032013; 2023JH3/10200005; N2317002

Publication year:

2024

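Journal of Chinese Information Processing (中文信息学报)

Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences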

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 0.8

ISSN: 1003-0077

Year, volume (issue): 2024, 38(9)

Number of references: 4