Journal Information
ACM Transactions on Modeling and Performance Evaluation of Computing Systems
Association for Computing Machinery
Quarterly
ISSN: 2376-3639
ACM Transactions on Modeling and Performance Evaluation of Computing Systems (indexed in EI, ESCI)
Abstract: Federated learning has recently emerged as a popular approach to training machine learning models on data that is scattered across multiple heterogeneous devices, often referred to as "clients" in a federated learning system. These clients iteratively compute updates to the machine learning models on their local datasets. These updates are periodically aggregated across clients, typically but not always with the help of a parameter server. The aggregated model then serves as the starting point for new rounds of client updates. In many real-world applications such as connected-and-autonomous vehicles (CAVs), the underlying distributed/decentralized systems on which federated learning algorithms are executing exhibit a wide degree of heterogeneity, including but not limited to data distributions, computation speeds, and external local environments. Moreover, the clients are often resource-constrained edge or end devices and may compete for common resources such as communication bandwidth. Such heterogeneity raises significant research questions on how these systems will perform under different variants of federated learning algorithms.
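The iterative update/aggregate cycle described in this abstract can be summarized in a minimal sketch, assuming a FedAvg-style parameter server that averages client updates weighted by local dataset size; all function and variable names here (local_update, aggregate, federated_round) are illustrative, not taken from the article.

```python
# Minimal sketch of the client-update / server-aggregate cycle described above.
# Assumes a FedAvg-style weighted average; all names are illustrative.
import numpy as np

def local_update(model, data, lr=0.1, steps=5):
    """One client's local training pass (least-squares gradient as a stand-in)."""
    x, y = data
    for _ in range(steps):
        grad = x.T @ (x @ model - y) / len(y)
        model = model - lr * grad
    return model

def aggregate(client_models, client_sizes):
    """Parameter-server aggregation: average weighted by local dataset size."""
    weights = np.array(client_sizes) / sum(client_sizes)
    return sum(w * m for w, m in zip(weights, client_models))

def federated_round(global_model, client_datasets):
    """One federated round over heterogeneous clients; the result seeds the next round."""
    updates = [local_update(global_model.copy(), d) for d in client_datasets]
    sizes = [len(d[1]) for d in client_datasets]
    return aggregate(updates, sizes)
```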
Abstract: Communication overhead is a main bottleneck in federated learning (FL), especially in the wireless environment, due to the limited data rate and unstable radio channels. The communication challenge necessitates holistic selection of participating clients that accounts for both the computation needs and communication cost, as well as judicious allocation of the limited transmission resource. Meanwhile, the random, unpredictable nature of both the training data samples and the communication channels requires an online optimization approach that adapts to the changing system state over time. In this work, we consider a general framework of online joint client sampling and power allocation for wireless FL under time-varying communication channels. We formulate it as a stochastic network optimization problem that admits a Lyapunov-type solution approach. This leads to per-training-round subproblems with a special bi-convex structure, which we leverage to propose globally optimal solutions, culminating in a meta algorithm that provides strong performance guarantees. We further study three specific FL problems covering multiple scenarios, namely, with IID or non-IID data, whether robustness against data drift is required, and with unbiased or biased client sampling. We derive detailed algorithms for each of these problems. Simulations with standard classification tasks demonstrate that the proposed communication-aware algorithms outperform their counterparts under a wide range of learning and communication scenarios.
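To make the per-round structure of a Lyapunov-type approach concrete, the following is a generic drift-plus-penalty sketch of client sampling under an average-power constraint. It is not the paper's formulation: the objective, the power model, and the constants (V, p_max, p_avg_budget) are assumptions chosen for illustration.

```python
# Generic drift-plus-penalty sketch: per-round client sampling and power choice,
# with virtual queues tracking each client's average-power budget. Illustrative only.
import numpy as np

def per_round_decision(Q, channel_gain, utility, p_max, V=10.0):
    """Decide, per client, whether to sample it this round and at what transmit power."""
    selected, power = [], []
    for q, g, u in zip(Q, channel_gain, utility):
        p = min(p_max, 1.0 / max(g, 1e-6))   # better channels need less power
        score = V * u - q * p                # learning utility vs. queued energy cost
        take = score > 0
        selected.append(take)
        power.append(p if take else 0.0)
    return np.array(selected), np.array(power)

def update_virtual_queues(Q, power, p_avg_budget):
    """Virtual queues grow when a client exceeds its average power budget."""
    return np.maximum(Q + power - p_avg_budget, 0.0)
```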
MARYAM BEN DRISS, ESSAID SABIR, HALIMA ELBIAZE, ABDOULAYE DIALLO...
Pages 3.1-3.24
Abstract: Federated Learning (FL) has gained attention across various industries for its capability to train machine learning models without centralizing sensitive data. While this approach offers significant benefits such as privacy preservation and decreased communication overhead, it presents several challenges, including deployment complexity and interoperability issues, particularly in heterogeneous scenarios or resource-constrained environments. Over-the-air (OTA) FL was introduced to tackle these challenges by disseminating model updates without necessitating direct device-to-device connections or centralized servers. However, OTA-FL brought forth limitations associated with heightened energy consumption and network latency. In this article, we propose a multi-attribute client selection framework employing the grey wolf optimizer (GWO) to strategically control the number of participants in each round and optimize the OTA-FL process while considering accuracy, energy, delay, reliability, and fairness constraints of participating devices. We evaluate the performance of our multi-attribute client selection approach in terms of model loss minimization, convergence time reduction, and energy efficiency. In our experimental evaluation, we assessed and compared the performance of our approach against the existing state-of-the-art methods. Our results demonstrate that the proposed GWO-based client selection outperforms these baselines across various metrics. Specifically, our approach achieves a notable reduction in model loss, accelerates convergence time, and enhances energy efficiency while maintaining high fairness and reliability indicators.
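As a rough illustration of GWO-driven, multi-attribute client selection, the sketch below runs a compact binary grey wolf optimizer over a 0/1 selection mask. The fitness weights and attribute names (acc, energy, delay, reliability) are assumptions for illustration and do not reproduce the paper's objective.

```python
# Compact binary GWO sketch for multi-attribute client selection. Illustrative only;
# the fitness function and its weights are assumed, not the paper's formulation.
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, acc, energy, delay, reliability):
    """Higher is better: favor accurate, reliable clients; penalize energy and delay."""
    if mask.sum() == 0:
        return -np.inf
    sel = mask.astype(bool)
    return acc[sel].mean() + reliability[sel].mean() - 0.5 * energy[sel].sum() - 0.5 * delay[sel].max()

def binary_gwo(attrs, n_wolves=20, n_iter=50):
    n_clients = len(attrs["acc"])
    wolves = rng.random((n_wolves, n_clients))          # continuous positions in [0, 1]
    for t in range(n_iter):
        masks = (wolves > 0.5).astype(int)
        scores = np.array([fitness(m, **attrs) for m in masks])
        alpha, beta, delta = wolves[np.argsort(scores)[::-1][:3]]   # three best wolves
        a = 2 - 2 * t / n_iter                          # exploration factor decays over time
        new_pos = np.zeros_like(wolves)
        for leader in (alpha, beta, delta):
            A = a * (2 * rng.random((n_wolves, n_clients)) - 1)
            C = 2 * rng.random((n_wolves, n_clients))
            D = np.abs(C * leader - wolves)
            new_pos += (leader - A * D) / 3             # average of the three leader pulls
        wolves = np.clip(new_pos, 0, 1)
    masks = (wolves > 0.5).astype(int)
    return masks[np.argmax([fitness(m, **attrs) for m in masks])]

# Usage (hypothetical attribute vectors, one entry per candidate client):
# best_mask = binary_gwo({"acc": acc, "energy": energy, "delay": delay, "reliability": rel})
```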
Abstract: Growing concerns about centralized mining of personal data threaten to stifle further proliferation of machine learning (ML) applications. Consequently, a recent trend in ML training advocates for a paradigm shift: moving the computation of ML models from a centralized server to a federation of edge devices owned by the users whose data is to be mined. Though such decentralization aims to alleviate concerns related to raw data sharing, it introduces a set of challenges due to the hardware heterogeneity among the devices possessing the data. The heterogeneity may, in the most extreme cases, impede the participation of low-end devices in the training or even prevent the deployment of the ML model to such devices. Recent research in distributed collaborative machine learning (DCML) promises to address the issue of ML model training over heterogeneous devices. However, the actual extent to which the issue is solved remains unclear, especially as an independent investigation of the proposed methods' performance in realistic settings is missing. In this paper, we present a detailed survey and an evaluation of algorithms that aim to enable collaborative model training across diverse devices. We explore approaches that harness three major strategies for DCML, namely Knowledge Distillation, Split Learning, and Partial Training, and we conduct a thorough experimental evaluation of these approaches on a real-world testbed of 14 heterogeneous devices. Our analysis compares algorithms based on the resulting model accuracy, memory consumption, CPU utilization, network activity, and other relevant metrics, and provides guidelines for practitioners as well as pointers for future research in DCML.
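Of the three DCML strategies named in this abstract, Knowledge Distillation is the most self-contained to sketch: a small on-device "student" matches the softened outputs of a larger "teacher", so weak devices can train a model they can actually run. The blending weights and temperature below are conventional defaults, not values from the surveyed works.

```python
# Minimal knowledge distillation loss (Hinton-style): cross-entropy on true labels
# blended with KL divergence to the teacher's temperature-softened outputs.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """alpha weights the hard-label term; (1 - alpha) * T^2 weights the soft-label term."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12)), axis=-1)
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return alpha * ce.mean() + (1 - alpha) * (T ** 2) * kl.mean()
```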
Abstract: Federated learning (FL) is a promising technique for decentralized privacy-preserving Machine Learning (ML) with a diverse pool of participating devices with varying device capabilities. However, existing approaches to handle such heterogeneous environments do not consider "fairness" in model aggregation, resulting in significant performance variation among devices. Meanwhile, prior works on FL fairness remain hardware-oblivious and cannot be applied directly without severe performance penalties. To address this issue, we propose a novel hardware-sensitive FL method called FairHetero that promotes fairness among heterogeneous federated clients. Our approach offers tunable fairness within a group of devices with the same ML architecture as well as across different groups with heterogeneous models. Our evaluation on the MNIST, FEMNIST, CIFAR10, and SHAKESPEARE datasets demonstrates that FairHetero can reduce variance among participating clients' test loss compared to the existing state-of-the-art techniques, resulting in increased overall performance.
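FairHetero itself is not reproduced here; as background for the "tunable fairness" idea, the sketch below shows a generic loss-aware aggregation in the spirit of q-FedAvg, where a parameter q up-weights clients with higher loss (q = 0 recovers plain size-weighted averaging). All names are illustrative.

```python
# Generic fairness-weighted aggregation sketch (q-FedAvg-like), not FairHetero itself.
import numpy as np

def fair_aggregate(client_models, client_losses, client_sizes, q=1.0):
    losses = np.asarray(client_losses, dtype=float)
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes * np.power(losses + 1e-12, q)   # up-weight poorly served clients
    weights = weights / weights.sum()
    return sum(w * m for w, m in zip(weights, client_models))
```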
THOMAS SANDHOLM, SAYANDEV MUKHERJEE, BERNARDO HUBERMAN
Pages 6.1-6.28
Abstract: We propose, analyze, and experimentally evaluate a novel secure aggregation algorithm targeted at cross-organizational federated learning applications with a fixed set of participating learners. Our solution organizes learners in a chain and encrypts all traffic to reduce the controller of the aggregation to a mere message broker. We show that our algorithm scales better and is less resource-demanding than existing solutions, while being easy to implement on constrained platforms. With 36 nodes, our method outperforms state-of-the-art secure aggregation by 70x and 56x, with and without failover, respectively.
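The chain topology described in this abstract can be conveyed with a toy sketch: learners pass a running sum along a fixed chain, so the coordinator only relays opaque messages and sees the final aggregate, never an individual update. Hop-to-hop encryption is deliberately elided here; a real implementation would encrypt each forwarded message for the next learner, as the paper describes.

```python
# Toy chain-aggregation sketch: only partial sums travel between learners,
# and the broker never touches an individual client's update. Illustrative only.
import numpy as np

def chain_aggregate(client_updates):
    """client_updates: list of numpy arrays, one per learner, in chain order."""
    running = np.zeros_like(client_updates[0])
    for update in client_updates:          # each learner adds its update and forwards
        running = running + update         # what travels is only the partial sum
    return running / len(client_updates)   # final average released by the last learner
```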