Cafe: Improved Federated Data Imputation by Leveraging Missing Data Heterogeneity

Abstract: Federated learning (FL), a decentralized machine learning approach, offers strong performance while alleviating autonomy and confidentiality concerns. Despite FL's popularity, how to deal with missing values in a federated manner is not well understood. In this work, we initiate a study of federated imputation of missing values, particularly in complex scenarios where missing data heterogeneity exists and the state-of-the-art (SOTA) approaches for federated imputation suffer a significant loss in imputation quality. We propose Cafe, a personalized FL approach for missing data imputation. Cafe is inspired by the observation that heterogeneity can induce differences in the observable and missing data distributions across clients, and that these differences can be leveraged to improve imputation quality. Cafe computes personalized weights that are automatically calibrated to the level of heterogeneity, which can remain unknown, to develop a personalized imputation model for each client. An extensive empirical evaluation over a variety of settings demonstrates that Cafe matches the performance of SOTA baselines in homogeneous settings while significantly outperforming the baselines in heterogeneous settings.
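
The abstract does not state how Cafe derives its personalized weights, so the following is only a minimal illustrative sketch of the general idea of personalized weighted aggregation, not the paper's algorithm. It assumes each client shares a parameter vector for its imputation model plus a small summary vector (here, hypothetical per-feature missing rates), and that a client weights its peers by a softmax over negative squared distances between summaries. The function names, the similarity rule, and the `temperature` parameter are all assumptions made for illustration.

```python
import math

def personalized_weights(stats, i, temperature=1.0):
    """Hypothetical rule (not from the paper): softmax over the negative
    squared distance between client i's summary and every client's summary."""
    dists = [sum((a - b) ** 2 for a, b in zip(stats[i], s)) for s in stats]
    scores = [math.exp(-d / temperature) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]

def personalized_model(params, weights):
    """Weighted average of the clients' parameter vectors."""
    dim = len(params[0])
    return [sum(w * p[k] for w, p in zip(weights, params)) for k in range(dim)]

# Toy parameter vectors for three clients (made-up numbers).
params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Homogeneous case: identical summaries, so the weights are uniform and the
# personalized model is the plain average of all clients.
homogeneous = [[0.1, 0.1], [0.1, 0.1], [0.1, 0.1]]
w_hom = personalized_weights(homogeneous, 0)
model_hom = personalized_model(params, w_hom)

# Heterogeneous case: client 2 has very different missing rates, so client 0
# gives it much less weight than it gives the similar client 1.
heterogeneous = [[0.1, 0.1], [0.1, 0.2], [0.9, 0.8]]
w_het = personalized_weights(heterogeneous, 0, temperature=0.1)
model_het = personalized_model(params, w_het)
```

In this sketch the homogeneous case reduces to uniform averaging, while in the heterogeneous case dissimilar clients contribute less. That mirrors, in spirit only, the behavior the abstract reports for homogeneous versus heterogeneous settings.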

Keywords:

Imputation; Data models; Distributed databases; Hospitals; Glucose; Predictive models; Computational modeling; Biological system modeling; Mathematical models; Protocols

Authors:

Sitao Min, Hafiz Asif, Xinyue Wang, Jaideep Vaidya

Author affiliations:

Rutgers University, Newark, NJ, USA

Rutgers University, Newark, NJ, USA|Hofstra University, Hempstead, NY, USA

Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China

Publication year:

2025

DOI:

10.1109/TKDE.2025.3537403

IEEE Transactions on Knowledge and Data Engineering

ISSN:

Year, volume (issue): 2025, 37(5)

Number of references: 79