Application Research of Computers (计算机应用研究), 2025, Vol. 42, Issue 1: 156-164. DOI: 10.19734/j.issn.1001-3695.2024.06.0195

面向类不平衡和重叠的工控数据异常检测的半监督欠采样方法

Semi-supervised under-sampling method for anomaly detection of industrial control data with class imbalance and overlap

Gu Zhaojun (顾兆军) 1, Yang Xueying (扬雪影) 2, Sui He (隋翯) 3, Zhang Yinuo (张一诺) 2

Author information

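  • 1. Information Security Center, Civil Aviation University of China, Tianjin 300300, China
  • 2. Information Security Center, Civil Aviation University of China, Tianjin 300300, China; College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • 3. College of Aeronautical Engineering, Civil Aviation University of China, Tianjin 300300, China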

Abstract

Anomaly detection in industrial control systems faces the coupled problems of missing label information, class imbalance, and class overlap, which prevent existing classifiers from accurately detecting anomalous data. Existing data-level sampling methods suffer from inaccurate pseudo-labeling, unstable sampling results, and low overlap-detection rates when they assign pseudo-labels, balance the data, or detect overlapping regions. To address this, the paper proposes an undersampling method based on semi-supervised learning (SSLU-LP). The method combines a label propagation mechanism with a one-class classifier through a heterogeneous ensemble to supply pseudo-labels; builds an overlap-region detection model using a minimum spanning tree strategy; and applies an undersampling strategy that selectively removes some majority-class samples via nearest-neighbor search. Finally, the method is combined with four classical classifiers and compared with nine hybrid algorithms on nine industrial control datasets. Experimental results show that the proposed method pseudo-labels unlabeled data accurately, detects overlapping data in imbalanced datasets efficiently and effectively, improves classifier training, and raises the classifiers' anomaly detection performance.
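The overlap-detection and undersampling stages named in the abstract can be pictured with a minimal Python sketch. This is not the authors' SSLU-LP implementation: the function names, the rule "a sample is in the overlap region if it touches a minimum-spanning-tree edge whose endpoints carry different labels", and the toy data are all assumptions made for illustration. The pseudo-labeling stage (label propagation plus a one-class classifier) is omitted, and the nearest-neighbor selection the paper describes is simplified to removing every flagged majority-class sample.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each vertex
    best_from = [-1] * n         # tree vertex that provides that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def overlap_indices(points, labels):
    """Assumed rule: flag both endpoints of every cross-class MST edge."""
    flagged = set()
    for i, j in minimum_spanning_tree(points):
        if labels[i] != labels[j]:
            flagged.update((i, j))
    return flagged


def undersample_majority(points, labels, majority_label):
    """Drop majority-class samples that were flagged as overlapping."""
    flagged = overlap_indices(points, labels)
    keep = [i for i in range(len(points))
            if not (i in flagged and labels[i] == majority_label)]
    return [points[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: label 0 is the majority (normal) class,
# label 1 the minority (anomalous) class.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4.5, 4.5), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 0, 0, 1, 1, 1]
new_points, new_labels = undersample_majority(points, labels, majority_label=0)
```

In this toy example the only majority sample sitting next to the minority cluster, `(4.5, 4.5)`, is removed, while all minority samples are kept. Prim's algorithm on the complete graph costs O(n²) distance evaluations, which is fine for a sketch but would likely need a sparse neighbor graph on real industrial control datasets.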

Key words

industrial control system/class imbalance/class overlap/semi-supervised learning/anomaly detection


Publication year

2025
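Application Research of Computers (计算机应用研究)
Sichuan Electronic Computer Application Research Center (四川省电子计算机应用研究中心)

Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)
Impact factor: 0.93
ISSN: 1001-3695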