Enhancing Generalization Robustness of Federated Learning in Highly Heterogeneous Environments
Wei Wan 1, Shengshan Hu 1, Jianrong Lu 1, Minghui Li 2, Ziqi Zhou 1, Hai Jin 3
Author information
- 1. School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074; National Engineering Research Center for Big Data Technology and System, Wuhan 430074; Key Laboratory of Services Computing Technology and System, Ministry of Education, Wuhan 430074; Hubei Key Laboratory of Distributed System Security, Wuhan 430074; Hubei Engineering Research Center on Big Data Security, Wuhan 430074
- 2. School of Software Engineering, Huazhong University of Science and Technology, Wuhan 430074
- 3. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074; National Engineering Research Center for Big Data Technology and System, Wuhan 430074; Key Laboratory of Services Computing Technology and System, Ministry of Education, Wuhan 430074; Hubei Key Laboratory of Cluster and Grid Computing, Wuhan 430074
Abstract
Federated learning (FL) is a distributed processing network centered on protecting clients' private data, offering a promising solution to privacy-leakage problems. A major challenge in FL, however, is training clients' models over significantly non-independent and identically distributed (non-IID) data, which leads to a low-performance global model. Although many previous works have investigated this issue, this paper finds that they offer little or no improvement over the standard FedAvg baseline when facing highly non-IID data, unstable client participation, and deep models, seriously hindering the privacy-protection value of FL. To address this issue, a new solution called FedUp is proposed. FedUp is a robust optimization scheme for non-IID FL that improves the generalization robustness of the global model while retaining the privacy-protection characteristics of FL. FedUp minimizes an upper bound of the global empirical loss function to ensure that the model exhibits a smaller generalization error. Simulation experiments show that FedUp achieves significant advantages over state-of-the-art methods and is robust to highly non-IID data as well as unstable and large-cohort client participation. This solution has the potential to improve the performance of FL and make it more practical for privacy-protection applications.
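The paper does not include code, but the two ideas the abstract contrasts can be sketched briefly: the FedAvg baseline aggregates client models by a sample-count-weighted average, while FedUp instead minimizes an upper bound of the global empirical loss. The sketch below shows FedAvg aggregation plus one common, hypothetical way to realize a loss upper bound (up-weighting high-loss clients via a softmax); function names, the temperature parameter, and the softmax weighting are illustrative assumptions, not the authors' FedUp algorithm.

```python
# Hedged sketch: FedAvg aggregation vs. an upper-bound-style reweighting.
# NOT the paper's FedUp implementation; names and weighting are illustrative.
import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    """FedAvg: convex combination of client parameter vectors,
    weighted by each client's local sample count."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    stacked = np.stack(client_params)   # shape: (n_clients, n_params)
    return weights @ stacked            # weighted average of parameters

def loss_upper_bound_weights(client_losses, temperature=1.0):
    """One assumed way to target an upper bound of the global loss:
    softmax over local losses, so worse-off clients get larger weight.
    As temperature -> 0 this approaches the max (worst-case) client loss."""
    z = np.asarray(client_losses, dtype=float) / temperature
    z -= z.max()                        # numerical stability
    w = np.exp(z)
    return w / w.sum()

params = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
print(fedavg_aggregate(params, [10, 30]))   # -> [2.5 3.5]
```

Under highly non-IID data, plain sample-count weighting lets well-represented clients dominate the average; a bound-minimizing weighting of this kind shifts mass toward clients the current global model serves worst, which is one way to read the abstract's generalization-robustness claim.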
Key words
distributed network / federated learning / heterogeneous optimization / generalization / robustness / privacy protection
Funding
- National Natural Science Foundation of China (U20A20177)
- Key R&D Project of the Hubei Province Technology Innovation Plan (2021BAA032)
Publication year
2024