A data heterogeneity processing method based on asynchronous hierarchical federated learning
In the era of ubiquitous Internet of Things (IoT) devices, vast amounts of data with varying distributions and volumes are continuously generated, leading to pervasive data heterogeneity. For federated learning on intelligent IoT devices, traditional synchronous mechanisms fall short in handling non-IID data distributions. They also suffer from single points of failure and the complexity of maintaining a global clock. Asynchronous mechanisms, however, may introduce additional communication overhead and stale updates under non-IID data distributions. To offer a more flexible solution to these challenges, an asynchronous hierarchical federated learning method is proposed. First, the BIRCH algorithm is employed to analyze the data distribution across the IoT nodes and group them into clusters. Next, the data within each cluster is split and validated to identify nodes with high data quality. Nodes from high-quality clusters are then split off and reassigned to lower-quality clusters, forming new, optimized clusters. Finally, two-stage model training is conducted, consisting of intra-cluster aggregation followed by global aggregation. The proposed approach is evaluated on the MNIST dataset. The results show that, compared with the classical FedAvg baseline, the proposed approach converges faster on non-IID datasets and improves model accuracy by more than 15%.
Internet of Things (IoT); federated learning; asynchronous federated learning; hierarchical federated learning; non-independent and identically distributed data; data distribution
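The two-stage aggregation named in the abstract (intra-cluster, then global) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that cluster membership is already known (for example from a BIRCH run), that each model is a flat list of floats, and that both stages use a FedAvg-style average weighted by sample count. The asynchronous scheduling and the cluster reorganization step are not modeled, and the names `weighted_average` and `two_stage_aggregate` are hypothetical.

```python
from typing import Dict, List, Tuple

Weights = List[float]  # a flattened model, used here for illustration only


def weighted_average(models: List[Tuple[Weights, int]]) -> Tuple[Weights, int]:
    """Average (weights, n_samples) pairs, weighting each model by its sample count."""
    total = sum(n for _, n in models)
    if total == 0:
        raise ValueError("cannot average models with zero total samples")
    dim = len(models[0][0])
    avg = [sum(w[i] * n for w, n in models) / total for i in range(dim)]
    return avg, total


def two_stage_aggregate(clusters: Dict[str, List[Tuple[Weights, int]]]) -> Weights:
    """Stage 1: aggregate inside each cluster. Stage 2: aggregate the cluster models."""
    # Assumption: empty clusters are skipped rather than treated as an error.
    cluster_models = [weighted_average(nodes) for nodes in clusters.values() if nodes]
    global_weights, _ = weighted_average(cluster_models)
    return global_weights


# Hypothetical toy input: two clusters, three nodes, two-parameter models.
clusters = {
    "cluster_a": [([1.0, 2.0], 10), ([3.0, 4.0], 30)],
    "cluster_b": [([5.0, 6.0], 60)],
}
global_model = two_stage_aggregate(clusters)
print(global_model)
```

With sample-count weights at both stages, the result equals a flat weighted average over all nodes. In an asynchronous setting the cluster models would arrive at different times and would typically need a different weighting; the abstract does not specify one, so it is left out here.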