Study on Client Selection Strategy and Dataset Partition in Federated Learning Based on Edge-TB

Federated learning is one of the real-world applications of distributed machine learning. To address heterogeneity in federated learning, this paper builds on the FedProx algorithm and proposes a client selection strategy that preferentially selects clients with larger proximal terms. This strategy outperforms the common strategy of selecting clients with larger local loss values: it effectively speeds up the convergence of FedProx under heterogeneous data and systems and improves the accuracy reached within a limited number of aggregation rounds. To match the data-heterogeneity assumption of federated learning, a heterogeneous data partitioning procedure is designed, which yields a heterogeneous federated dataset built from a real image dataset for use in the experiments. Using the open-source distributed machine learning framework Edge-TB as the testbed and heterogeneously partitioned CIFAR-10 as the dataset, experiments show that, within a limited number of aggregation rounds, the improved FedProx algorithm with the new client selection strategy raises accuracy by 14.96% and reduces communication overhead by 6.3% relative to the original algorithm. Compared with the SCAFFOLD algorithm, accuracy improves by 3.6%, communication overhead falls by 51.7%, and training time falls by 15.4%.
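The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how "larger proximal term" is turned into a selection rule, so a simple top-k rule is assumed here, with the proximal term taken in its usual FedProx form, (mu / 2) * ||w_local - w_global||^2. The abstract also gives no details of the partitioning procedure, so a commonly used Dirichlet label-skew split is swapped in as a stand-in. All function names and parameters (`proximal_term`, `select_clients`, `dirichlet_partition`, `mu`, `k`, `alpha`) are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def proximal_term(local_w: Sequence[float], global_w: Sequence[float], mu: float) -> float:
    """FedProx proximal term (mu / 2) * ||w_local - w_global||^2 on flattened weights."""
    return 0.5 * mu * sum((l - g) ** 2 for l, g in zip(local_w, global_w))


def select_clients(local_ws: Dict[str, Sequence[float]], global_w: Sequence[float],
                   mu: float, k: int) -> List[str]:
    """Return the k client ids with the largest proximal term.

    Assumption: a plain top-k rule; ties are broken by client id.
    """
    scores = {cid: proximal_term(w, global_w, mu) for cid, w in local_ws.items()}
    return sorted(scores, key=lambda cid: (-scores[cid], cid))[:k]


def dirichlet_partition(labels: Sequence[int], num_clients: int,
                        alpha: float, seed: int = 0) -> List[List[int]]:
    """Label-skewed split (assumed stand-in, not the paper's procedure).

    For each class, sample indices are divided among clients in proportions
    drawn from a symmetric Dirichlet(alpha); smaller alpha gives stronger skew.
    """
    rng = random.Random(seed)
    shards: List[List[int]] = [[] for _ in range(num_clients)]
    by_class: Dict[int, List[int]] = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Normalised Gamma(alpha, 1) draws form a Dirichlet(alpha) sample.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += gammas[c] / total
            # The last client takes the remainder so no index is dropped.
            end = len(idxs) if c == num_clients - 1 else round(cum * len(idxs))
            shards[c].extend(idxs[start:end])
            start = end
    return shards
```

In a training loop under these assumptions, `dirichlet_partition` would be called once on the dataset labels before training, and `select_clients` would be called each round on the flattened local and global weights to decide which updates are aggregated.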

Distributed machine learning; Federated learning; Optimization algorithm; Regularization; Proximal term

Zhou Tianyang (周天阳), Yang Lei (杨磊)

School of Software Engineering, South China University of Technology, Guangzhou 510006

2024

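Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)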

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(z1)
  • 18