Study on Client Selection Strategy and Dataset Partition in Federated Learning Based on Edge-TB
Federated learning is a practical application of distributed machine learning. To address heterogeneity in federated learning, this paper proposes a client selection strategy, built on the FedProx algorithm, that preferentially selects clients with large proximal terms. It outperforms the common strategy of selecting clients with large local loss values: it effectively improves the convergence rate of FedProx under heterogeneous data and systems, and improves the accuracy reached within a limited number of aggregation rounds. Based on the assumption of heterogeneous data in federated learning, a heterogeneous data partitioning procedure is designed, and a heterogeneous federated dataset derived from a real image dataset is obtained as the experimental dataset. Experiments use the open-source distributed machine learning framework Edge-TB as the test platform and the heterogeneously partitioned CIFAR-10 as the dataset. Within a limited number of aggregation rounds, the improved FedProx algorithm with the new client selection strategy increases accuracy by 14.96% and reduces communication overhead by 6.3% compared with the original algorithm. Compared with the SCAFFOLD algorithm, accuracy is improved by 3.6%, communication overhead is reduced by 51.7%, and training time is reduced by 15.4%.
Distributed machine learning; Federated learning; Optimization algorithm; Regularization; Proximal term
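The client selection idea summarized in the abstract can be illustrated with a minimal sketch. It assumes that each client's proximal term takes the standard FedProx form (mu / 2) * ||w_k - w||^2, where w_k is the client's local model and w is the global model, and that the server keeps the clients with the largest values. The function names `proximal_term` and `select_clients`, the flat weight vectors, and the values of `mu` and `num_selected` are hypothetical and chosen only for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
import numpy as np


def proximal_term(local_weights, global_weights, mu):
    """Standard FedProx proximal term: (mu / 2) * ||w_local - w_global||^2."""
    diff = (np.asarray(local_weights, dtype=float)
            - np.asarray(global_weights, dtype=float)).ravel()
    return 0.5 * mu * float(np.dot(diff, diff))


def select_clients(local_weights_by_client, global_weights, mu, num_selected):
    """Return the ids of the num_selected clients with the largest proximal term.

    Ties are broken by client id so the result is deterministic.
    """
    scores = {
        cid: proximal_term(w, global_weights, mu)
        for cid, w in local_weights_by_client.items()
    }
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ranked[:num_selected]


# Toy example (assumed values): three clients, a zero global model.
global_w = np.zeros(3)
clients = {
    "a": np.array([0.1, 0.0, 0.0]),  # close to the global model
    "b": np.array([1.0, 1.0, 0.0]),
    "c": np.array([0.0, 2.0, 0.0]),  # farthest from the global model
}
chosen = select_clients(clients, global_w, mu=0.01, num_selected=2)
print(chosen)
```

In this toy setup the clients whose local models have drifted farthest from the global model are ranked first. A loss-based strategy would instead rank clients by their local loss values, which is the baseline the abstract compares against.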