Semi-supervised under-sampling method for anomaly detection of industrial control data with class imbalance and overlap

Anomaly detection in industrial control systems faces three coupled problems: a lack of label information, class imbalance, and class overlap, which together keep existing classifiers from detecting anomalous data accurately. When existing data-level sampling methods pseudo-label data, balance classes, or detect overlapping regions, they suffer from inaccurate pseudo-labels, unstable sampling results, and low overlap recognition rates. To address this, the paper proposes an under-sampling method based on semi-supervised learning (SSLU-LP). The method combines a label propagation mechanism with a one-class classifier through a heterogeneous ensemble to supply pseudo-labels, builds an overlap region detection model using a minimum spanning tree strategy, and applies an under-sampling strategy that selectively removes some majority-class samples via nearest neighbor search. The method is then combined with four classical classifiers and compared against nine hybrid algorithms on nine industrial control datasets. The experimental results show that the proposed method pseudo-labels unlabeled data accurately, detects overlapping data in imbalanced datasets efficiently and effectively, improves classifier training, and raises the classifiers' anomaly detection performance.
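The abstract names the overlap detection and under-sampling stages but not their details, so the following is only a minimal sketch of one plausible reading, not the SSLU-LP algorithm itself. It assumes that a sample lies in the overlap region when a Euclidean minimum spanning tree edge joins it to a sample of the other class, and that an overlapping majority sample is removed when at least one of its `k` nearest neighbors is a minority sample. The function names, the label convention (0 for majority, 1 for minority), and the parameter `k` are all hypothetical. The pseudo-labeling stage is left out, so labels are taken as given.

```python
import math

MAJORITY, MINORITY = 0, 1  # label convention assumed for this sketch


def mst_edges(points):
    """Return the edges (i, j) of a Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known distance from the tree to each node
    best_from = [-1] * n        # tree node that realises that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax distances of the remaining nodes through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v], best_from[v] = d, u
    return edges


def overlap_indices(points, labels):
    """Indices of samples joined to the opposite class by an MST edge (assumed overlap rule)."""
    overlap = set()
    for i, j in mst_edges(points):
        if labels[i] != labels[j]:
            overlap.update((i, j))
    return overlap


def undersample(points, labels, k=3):
    """Return indices to keep: drop overlapping majority samples that have
    a minority sample among their k nearest neighbours (assumed removal rule)."""
    removed = set()
    for i in overlap_indices(points, labels):
        if labels[i] != MAJORITY:
            continue  # minority samples are never removed
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        if any(labels[j] == MINORITY for j in neighbours):
            removed.add(i)
    return [i for i in range(len(points)) if i not in removed]
```

The brute-force neighbor search and the O(n²) Prim loop keep the sketch dependency-free. For datasets of realistic size, a spatial index and a sparse-graph MST routine would be the more practical choice.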

Keywords: industrial control system; class imbalance; class overlap; semi-supervised learning; anomaly detection

Gu Zhaojun (顾兆军), Yang Xueying (扬雪影), Sui He (隋翯), Zhang Yinuo (张一诺)

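Information Security Center, Civil Aviation University of China, Tianjin 300300

College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300

College of Aeronautical Engineering, Civil Aviation University of China, Tianjin 300300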

2025

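Application Research of Computers (计算机应用研究)
Sichuan Provincial Electronic Computer Application Research Center (四川省电子计算机应用研究中心)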

Peking University Core Journal (北大核心)
Impact factor: 0.93
ISSN: 1001-3695
Year, volume (issue): 2025, 42(1)