Time Series Clustering Method Based on Contrastive Learning

Existing deep clustering methods rely heavily on complex feature extraction networks and clustering algorithms, which makes it difficult to define the similarity between time series intuitively. Contrastive learning can define the interval similarity of time series in terms of positive and negative samples and jointly optimize feature extraction and clustering. Building on contrastive learning, this paper proposes a time series clustering model that does not rely on a complex representation network. Because existing time series data augmentation methods struggle to describe the transformation invariance of time series, the paper also proposes a data augmentation method based on the shape features of time series, which captures the similarity of sequences while ignoring their time-domain characteristics. The model constructs positive and negative sample pairs by setting different shape transformation parameters, learns feature representations, and projects them into a feature space. At both the instance level and the cluster level, it uses a cross-entropy loss to maximize the similarity of positive pairs and minimize the similarity of negative pairs, so that feature representation and cluster assignment are learned jointly, end to end. Extensive experiments on 32 datasets from the UCR archive show that the model obtains clustering results comparable to or better than those of existing methods without relying on a specific representation learning network.
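
The abstract does not give the loss formulas, so the following is only a minimal sketch of how an instance-level plus cluster-level contrastive objective with a cross-entropy form is commonly written (an NT-Xent-style loss). It assumes two augmented views of the same batch have already been encoded. The names `z_a`, `z_b` (feature rows, one per sample), `p_a`, `p_b` (soft cluster assignments, one column per cluster), and the temperature values are all hypothetical and are not taken from the paper. The paper's shape-based augmentation is not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (small epsilon avoids division by zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def nt_xent(view_a, view_b, temperature=0.5):
    """Cross-entropy contrastive loss over the 2N vectors of two views.

    Vector i of view_a and vector i of view_b form the positive pair;
    every other vector in either view acts as a negative.
    """
    vectors = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i, anchor in enumerate(vectors):
        positive = (i + n) % (2 * n)
        # Similarities to every vector except the anchor itself.
        logits = [cosine(anchor, other) / temperature
                  for j, other in enumerate(vectors) if j != i]
        pos_logit = cosine(anchor, vectors[positive]) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)

def columns(matrix):
    """Transpose a row-major matrix so that each cluster column becomes a vector."""
    return [list(col) for col in zip(*matrix)]

def contrastive_clustering_loss(z_a, z_b, p_a, p_b, t_instance=0.5, t_cluster=1.0):
    """Sum of an instance-level term (rows of features) and a
    cluster-level term (columns of soft assignments). Assumed form only."""
    instance_term = nt_xent(z_a, z_b, t_instance)
    cluster_term = nt_xent(columns(p_a), columns(p_b), t_cluster)
    return instance_term + cluster_term

# Toy inputs: 2 samples, 2-dimensional features, 2 clusters (illustrative values).
z_a = [[1.0, 0.0], [0.0, 1.0]]
z_b = [[0.9, 0.1], [0.1, 0.9]]
p_a = [[0.9, 0.1], [0.2, 0.8]]
p_b = [[0.8, 0.2], [0.1, 0.9]]
loss = contrastive_clustering_loss(z_a, z_b, p_a, p_b)
```

In this sketch, the same function serves both levels: applied to rows it compares samples, and applied to the columns of the assignment matrices it compares clusters. Whether the paper adds further terms (for example a regularizer against degenerate cluster assignments) cannot be determined from the abstract.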

Keywords: Time series clustering; Contrastive learning; Data augmentation; Representation learning; Joint optimization

Authors: YANG Bo (杨博), LUO Jiachen (罗嘉琛), SONG Yantao (宋艳涛), WU Hongtao (吴宏涛), PENG Furong (彭甫镕)

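Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006

School of Computer and Information Technology, Shanxi University, Taiyuan 030006

Shanxi Transportation Technology Research & Development Co., Ltd., Taiyuan 030006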

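Funding:
National Natural Science Foundation of China (62276162)
Key Research and Development Program of Shanxi Province (202102070301019)
Basic Research Program of Shanxi Province (201901D211170)
Basic Research Program of Shanxi Province (202103021223464)
Nanjing International Joint Research and Development Project (202002021)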

2024

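Journal: Computer Science (计算机科学)
Publisher: Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)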

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(2)