Efficient Vertical Federated Learning Based on Embedding and Gradient Bidirectional Compression
Vertical federated learning increases the value of data by combining local data features from multiple parties to jointly train a target model without leaking data privacy, and it has received widespread attention from companies and institutions in industry. During training, the intermediate embeddings uploaded by the clients and the gradients returned by the server incur a huge amount of communication, so communication cost becomes a key bottleneck limiting the practical application of vertical federated learning. Consequently, current research focuses on designing effective algorithms that reduce the communication volume and improve communication efficiency. To this end, this study proposes an efficient compression algorithm based on bidirectional compression of embeddings and gradients. For the embedding representations uploaded by the clients, an improved sparsification method combined with a cache-reuse mechanism is employed; for the gradient information distributed by the server, a mechanism combining discrete quantization with Huffman coding is used. Experimental results show that the proposed algorithm reduces the communication volume by about 85%, improves communication efficiency, and shortens the overall training time while maintaining almost the same accuracy as the uncompressed baseline.
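The server-to-client direction described above, discrete quantization followed by Huffman coding, can be sketched as follows. This is a minimal illustration under stated assumptions: the 16-level uniform quantizer, the function names, and the synthetic gradient vector are all hypothetical choices for exposition, not the paper's actual implementation.

```python
import heapq
import random
from collections import Counter

def quantize(grad, levels=16):
    """Uniformly quantize a list of floats into `levels` integer bins."""
    lo, hi = min(grad), max(grad)
    step = (hi - lo) / (levels - 1)
    idx = [round((g - lo) / step) for g in grad]
    return idx, lo, step

def dequantize(idx, lo, step):
    """Map integer bins back to approximate float values."""
    return [lo + i * step for i in idx]

def huffman_code(symbols):
    """Build a Huffman codebook mapping each symbol to a bitstring."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap items carry an integer tiebreaker so dicts are never compared.
    heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, [w1 + w2, counter, merged])
        counter += 1
    return heap[0][2]

# Demo on a synthetic "gradient" vector.
random.seed(0)
grad = [random.gauss(0, 1) for _ in range(1000)]
idx, lo, step = quantize(grad)
book = huffman_code(idx)
bits = sum(len(book[i]) for i in idx)
ratio = bits / (len(grad) * 32)  # payload bits vs. raw float32
recon = dequantize(idx, lo, step)
err = max(abs(r - g) for r, g in zip(recon, grad))
print(f"compression ratio vs float32: {ratio:.3f}, max error: {err:.4f}")
```

Because frequently occurring quantization bins receive short codes, the Huffman stage shrinks the payload well below the 4 bits per symbol a fixed-length code for 16 levels would need, while the reconstruction error stays bounded by half a quantization step.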