Semi-supervised under-sampling method for anomaly detection of industrial control data with class imbalance and overlap
Anomaly detection in industrial control systems faces three challenges: scarce label information, class imbalance, and class overlap, all of which hinder existing classifiers from detecting anomalies accurately. Current data-level sampling methods suffer from inaccurate pseudo-labeling, poor sampling stability, and low overlap detection rates. This paper therefore proposes a semi-supervised undersampling method (SSLU-LP). The method combines a label propagation mechanism with a one-class classifier in a heterogeneous ensemble to supplement pseudo-labels. It then builds an overlap-region detection model using a minimum spanning tree strategy and applies an undersampling strategy that selectively removes some majority-class samples via nearest-neighbor search. Finally, the proposed method is combined with 4 classical classifiers and compared with 9 hybrid algorithms on 9 industrial control datasets. Experimental results show that the proposed method can accurately pseudo-label unlabeled data, detect overlapping data in imbalanced datasets efficiently and effectively, improve classifier training performance, and enhance anomaly detection capability.
industrial control system; class imbalance; class overlap; semi-supervised learning; anomaly detection
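
The overlap-detection and undersampling steps summarized in the abstract can be illustrated with a minimal sketch. It is not the SSLU-LP implementation: the abstract does not state the overlap criterion, so the sketch assumes that a majority-class sample is "overlapping" when a minimum spanning tree edge joins it to a sample of another class. The function names `minimum_spanning_tree` and `undersample_overlap`, the `majority` parameter, and the toy data are all hypothetical, and the pseudo-labeling stage (label propagation plus a one-class classifier) is omitted.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known distance from the tree to each node
    best_from = [-1] * n        # tree node that achieves that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest node that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] != -1:
            edges.append((best_from[u], u))
        # Relax distances from the newly added node to the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def undersample_overlap(points, labels, majority=0):
    """Drop majority samples joined to another class by an MST edge.

    The cross-class-edge rule is an assumed overlap criterion for illustration,
    not the criterion used by the paper.
    """
    overlapping = set()
    for i, j in minimum_spanning_tree(points):
        if labels[i] != labels[j]:
            # Cross-class edge: flag the endpoint that belongs to the majority class.
            for k in (i, j):
                if labels[k] == majority:
                    overlapping.add(k)
    keep = [k for k in range(len(points)) if k not in overlapping]
    return keep, sorted(overlapping)


# Toy data: a majority cluster, one majority sample sitting next to the
# minority class, and two minority (anomalous) samples.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
          (4.0, 4.0), (4.2, 4.0), (5.0, 4.0)]
labels = [0, 0, 0, 0, 0, 1, 1]
keep, removed = undersample_overlap(points, labels, majority=0)
print("kept indices:", keep)
print("removed majority indices:", removed)
```

In this toy example only the majority sample adjacent to the minority class is removed, while the minority samples are left untouched. That matches the stated goal of thinning the majority class inside the overlap region rather than uniformly.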