
Time series classification method for label noise in distributed systems

Time series data are widespread on distributed edge devices in industrial, healthcare, and other application fields. Because such data often carry features that humans cannot readily recognize, time series classification tasks built on real-world data commonly suffer from data "islands" and labeling errors. To address this difficulty in distributed data environments, a federated temporal filtering framework is proposed. The framework exploits the strength of self-supervised contrastive learning in extracting representations of complex time series data, and combines it with federated learning to address the privacy and security issues of distributed systems while reducing communication cost. First, a set of benchmark samples is maintained on the server, and a time-series augmentation pre-supervised strategy based on a discriminative contrastive loss and a predictive contrastive loss is applied; a pre-training and fine-tuning procedure then yields a pre-supervised model whose time series representations generalize well. Next, a new label noise filtering method is introduced, in which pseudo-labels guided by the pre-supervised model are used together with the locally annotated labels to filter noisy data on the devices, and the resulting clean dataset is used to train the global model. Finally, the effectiveness of the framework is validated under various types of label noise, the impact of different benchmark data ratios on the framework is examined, and ablation experiments verify the filtering effect of each loss in the pre-supervised model.
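
The abstract does not spell out the exact filtering rule, so the following is only a minimal sketch of one plausible reading: a device keeps a sample when the pseudo-label predicted by the pre-supervised model agrees with the locally annotated label. The function name `filter_noisy_samples`, the `min_confidence` threshold, and the toy probabilities are all assumptions made for illustration and are not taken from the paper.

```python
def filter_noisy_samples(probs, local_labels, min_confidence=0.0):
    """Return indices of samples treated as clean (hypothetical agreement rule).

    probs: list of per-class probability lists from the pre-supervised model.
    local_labels: list of integer labels annotated on the device.
    min_confidence: assumed extra threshold on the pseudo-label probability.
    """
    kept = []
    for i, (p, y) in enumerate(zip(probs, local_labels)):
        # Pseudo-label = class with the highest predicted probability.
        pseudo = max(range(len(p)), key=p.__getitem__)
        # Keep the sample only if pseudo-label and local label agree
        # and the model is at least min_confidence sure of its prediction.
        if pseudo == y and p[pseudo] >= min_confidence:
            kept.append(i)
    return kept


# Toy data: four samples, two classes (values invented for illustration).
probs = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.3],
    [0.4, 0.6],
]
local_labels = [0, 1, 1, 1]

clean_indices = filter_noisy_samples(probs, local_labels)
confident_clean = filter_noisy_samples(probs, local_labels, min_confidence=0.7)
```

In this toy run the third sample is dropped because the model's pseudo-label (class 0) disagrees with the local label (class 1). Under the framework described above, only the retained samples would go on to train the global model.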

federated learning; self-supervised learning; time series classification; label noise; distributed system

林子谦、张坤、樊重俊、杨夏洁


Business School, University of Shanghai for Science and Technology, Shanghai 200093

School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433


2024

Control and Decision (控制与决策)
Northeastern University


CSTPCD; Peking University Core Journals
Impact factor: 1.227
ISSN:1001-0920
Year, volume (issue): 2024, 39(12)