
Federated Learning Optimization Method in Non-IID Scenarios

Federated learning (FL) allows multiple parties to train a global model collaboratively without exposing private data, but in real-world settings this collaboration faces the challenge of non-independent and identically distributed (Non-IID) client data, which slows model convergence and lowers accuracy. Many existing FL methods improve only one of the two stages, global model aggregation or local client update, so the neglected stage inevitably degrades the quality of the global model. This paper proposes FedMas, a federated learning optimization method based on hierarchical continual learning and the idea of hierarchical fusion. First, a client-layering strategy uses the DBSCAN algorithm to group clients with similar data distributions into layers, and each round selects only some clients from a single layer for training, avoiding the weight divergence caused by differing data distributions when the server aggregates the global model. Further, because the layers have different data distributions, clients apply a continual-learning solution to catastrophic forgetting during local updates, effectively fusing the differences among the data of clients in different layers and thus preserving the performance of the global model. Experiments on the MNIST and CIFAR-10 standard datasets show that, compared with the FedProx, Scaffold, and FedCurv FL algorithms, FedMas improves the test accuracy of the global model by 0.3 to 2.2 percentage points on average.
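The two steps named in the abstract, grouping clients with similar data distributions via DBSCAN and adding a catastrophic-forgetting remedy from continual learning to the local update, can be illustrated with a minimal sketch. This is not the authors' implementation: the use of per-client label histograms as the clustering feature, the function names, and the EWC-style quadratic penalty (the remedy that FedCurv also builds on) are assumptions made only for illustration.

import numpy as np
from sklearn.cluster import DBSCAN

def layer_clients(label_histograms, eps=0.3, min_samples=2):
    # label_histograms: (n_clients, n_classes) array, each row a client's
    # normalized class distribution (assumed clustering feature).
    # Returns {layer_id: [client indices]}; DBSCAN marks outlier clients as -1.
    layer_ids = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(label_histograms)
    layers = {}
    for client, layer in enumerate(layer_ids):
        layers.setdefault(int(layer), []).append(client)
    return layers

def regularized_local_loss(task_loss, params, global_params, fisher_diag, lam=0.1):
    # Local objective with an EWC-style quadratic penalty that discourages the
    # client from drifting away from parameters important to other layers.
    # params, global_params, fisher_diag: lists of equally shaped numpy arrays.
    penalty = sum(
        float(np.sum(f * (p - g) ** 2))
        for p, g, f in zip(params, global_params, fisher_diag)
    )
    return task_loss + 0.5 * lam * penalty

In each communication round, the server would pick one layer returned by layer_clients and aggregate only that layer's updates, while regularized_local_loss keeps a client's local training from overwriting knowledge tied to other layers; both signatures are illustrative rather than taken from the paper.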

Federated Learning (FL); continual learning; data heterogeneity; clustering; hierarchical optimization; data distribution

Song Huawei (宋华伟), Li Shengqi (李升起), Wan Fangjie (万方杰), Wei Yuping (卫玉萍)


School of Cyberspace Security, Zhengzhou University, Zhengzhou 450000, Henan, China


Henan Province Major Science and Technology Project

221100210100

2024

Computer Engineering (计算机工程)
East China Institute of Computing Technology; Shanghai Computer Society


CSTPCD; Peking University Core Journal (北大核心)
Impact factor: 0.581
ISSN: 1000-3428
Year, Volume (Issue): 2024, 50(3)