FedRCD:一种基于分布提取与社区检测的聚类联邦学习算法

FedRCD:A Clustering Federated Learning Algorithm Based on Distribution Extraction and Community Detection

王瑞聪 ¹边耐政 ¹吴英俊¹

扫码查看

作者信息

1. 湖南大学信息科学与工程学院长沙 410082
折叠

摘要

将客户端聚类并在簇内进行联邦学习是缓解传统联邦学习算法在非独立同分布(Non-IID)数据场景下表现不佳的一类有效方法.这类方法大多使用客户端本地模型的参数来表征数据特性,并利用参数间的"距离"来评估相似性,从而实现客户端的聚类,但由于神经网络神经元的置换不变性,聚类效果可能会不准确.此外,这类方法通常需要预设聚类数量,可能产生不合理的聚类,或者需要在算法迭代过程中进行聚类,这将带来过大的通信开销.在深入分析了现有方法的缺点之后,提出了一种新颖的联邦学习算法FedRCD.该算法结合了自编码器和K-Means算法,直接从客户端提取数据集的分布信息来描述其特性,从而降低了对模型参数的依赖;FedRCD还将客户端关系组织成图结构,并通过Louvain算法完成客户端聚类关系的构建,这个过程无需预设聚类数量,因此聚类结果更加合理.实验结果表明,FedRCD能更有效地挖掘客户端间的潜在聚类关系,在多种非独立同分布数据场景下,与其他联邦学习算法相比,显著提升了神经网络的训练效果.在CIFAR10数据集上,FedRCD的准确率比经典的FedAvg算法提高了 37.08％,比最新发布的FeSEM算法也提高了 1.89％,同时展现出更优秀的公平性表现.

Abstract

Clustering clients and conducting federated learning within clusters is an effective method to mitigate the poor perfor-mance of traditional federated learning algorithms in non-independently and identically distributed(Non-IID)data scenarios.Such methods primarily utilize the parameters of a client's local model to characterize data features,and evaluate similarity through the"distance"between parameters,thereby realizing client clustering.However,due to the permutation invariance of neurons in neu-ral networks,this could lead to inaccurate clustering results.Moreover,these methods typically require a predetermined number of clusters,which might result in unreasonable clusters,or they may require clustering during the algorithmic iterative process,lea-ding to substantial communication overhead.After in-depth analysis of the shortcomings of existing methods,a novel federated learning algorithm named FedRCD is proposed.This algorithm combines autoencoders and K-Means algorithms,directly extrac-ting distribution information from a client's dataset to represent its characteristics,thereby reducing reliance on model parame-ters.FedRCD also organizes the relationships between clients into a graph structure,and employs the Louvain algorithm to con-struct client clustering relationships.This process does not require pre-setting the number of clusters,which makes the clustering results more reasonable.Experimental results show that FedRCD can more effectively unearth latent clustering relationships be-tween clients.In a variety of Non-IID data scenarios,compared to other federated learning algorithms,it significantly improves the training effect of neural networks.On the CIFAR10 dataset,the accuracy of FedRCD surpasses the classical FedAvg algorithm by 37.08％,and even outperforms the newly released FeSEM algorithm by 1.89％,demonstrating superior fairness performance.

关键词

联邦学习/非独立同分布/分布提取/社区检测/Louvain算法

Key words

Federated learning/Non-IID/Distribution extraction/Community detection/Louvain algorithm

引用本文复制引用

出版年

2025

计算机科学

重庆西南信息有限公司（原科技部西南信息中心）

计算机科学

北大核心

影响因子：0.944

ISSN：1002-137X

段落导航