Research on Oversampling Algorithm Based on OCkNN+ENN
Class imbalance learning (CIL) is one of the hot topics in machine learning (ML). Among CIL methods, SMOTE is regarded as one of the benchmark algorithms. Although SMOTE performs well on most class-imbalanced datasets, it has some problems, such as introducing noise interference and propagating noise. Based on a study of SMOTE variants, a more robust and general algorithm, ONE-SMOTE, is proposed. The method first uses edited nearest neighbor (ENN) to clean the data and filter noise, then uses one-class k-nearest neighbor (OCkNN) to detect the relative density distribution information of the samples. The relative density position and the boundary of each sample can thus be located precisely and used for oversampling. The experimental results show that the algorithm can effectively improve classification accuracy.
Keywords: class imbalance learning; SMOTE; ENN; OCkNN; relative density distribution information
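The pipeline outlined in the abstract (ENN cleaning, a relative density estimate, then oversampling) can be illustrated with a minimal sketch. It is not the authors' ONE-SMOTE implementation: the abstract does not define the OCkNN density measure, so the score used here (mean distance to the k nearest majority samples divided by mean distance to the k nearest minority samples) is an assumption, as are the function names `enn_filter`, `relative_density`, and `oversample`, the `min_score` threshold, and the SMOTE-style interpolation step.

```python
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two points given as lists of floats."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def k_nearest(idx, point, pool, k):
    """Indices of the k points in pool nearest to point, skipping index idx."""
    order = sorted((j for j in range(len(pool)) if j != idx),
                   key=lambda j: dist(point, pool[j]))
    return order[:k]


def enn_filter(X, y, k=3):
    """Edited nearest neighbor: drop every sample whose label disagrees
    with the majority label of its k nearest neighbors."""
    keep = []
    for i, x in enumerate(X):
        votes = Counter(y[j] for j in k_nearest(i, x, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def relative_density(X_min, X_maj, k=3):
    """ASSUMED score, not the paper's definition: mean distance to the k
    nearest majority samples over mean distance to the k nearest minority
    samples. Values well above 1 suggest a dense minority region; values
    near or below 1 suggest a borderline or isolated sample."""
    scores = []
    for i, x in enumerate(X_min):
        d_min = [dist(x, X_min[j]) for j in k_nearest(i, x, X_min, k)]
        d_maj = [dist(x, X_maj[j]) for j in k_nearest(-1, x, X_maj, k)]
        mean_min = sum(d_min) / len(d_min)
        mean_maj = sum(d_maj) / len(d_maj)
        scores.append(mean_maj / (mean_min + 1e-12))
    return scores


def oversample(X, y, minority, n_new, k=3, min_score=1.0, seed=0):
    """Clean with ENN, score minority samples, then interpolate
    SMOTE-style between a seed and one of its minority neighbors.
    Needs at least two minority and one majority sample after cleaning."""
    rng = random.Random(seed)
    X, y = enn_filter(X, y, k)
    X_min = [x for x, c in zip(X, y) if c == minority]
    X_maj = [x for x, c in zip(X, y) if c != minority]
    scores = relative_density(X_min, X_maj, k)
    # Hypothetical rule: only samples scoring at least min_score act as
    # seeds; fall back to all minority samples if none qualify.
    seeds = [i for i, s in enumerate(scores) if s >= min_score]
    seeds = seeds or list(range(len(X_min)))
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(seeds)
        j = rng.choice(k_nearest(i, X_min[i], X_min, k))
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(X_min[i], X_min[j])])
    return X + synthetic, y + [minority] * n_new
```

In this sketch, cleaning runs before any synthetic sample is generated, so a mislabeled minority point lying inside the majority region is removed and cannot serve as an interpolation endpoint. This is one way to limit the noise propagation that the abstract attributes to plain SMOTE.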