A MULTI-STRATEGY PROCESS AND ENSEMBLE CLASSIFICATION ALGORITHM FOR IMBALANCED DATA
To address the tendency of traditional machine learning algorithms to ignore the minority classes when classifying imbalanced data, a combined classification algorithm based on multi-strategy processing, named MsBoost, is proposed. In the algorithm, the training data are first clustered. The minority classes are then oversampled, and the majority classes are undersampled using the proposed "three-in-one" algorithm. Different weights are assigned to the samples of different classes. The resampled samples of the two classes are combined, and the AdaBoost algorithm is used to boost the base learners. MsBoost was compared with the AdaBoost, RusBoost, SmoteBoost and CusBoost algorithms on 12 KEEL imbalanced datasets. MsBoost achieved the optimal result 6 times and the suboptimal result 2 times on both the AUC and G-mean indices, and the optimal result 1 time and the suboptimal result 6 times on F1-score, which shows that the algorithm can classify imbalanced data effectively.
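The rebalancing and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the MsBoost implementation: the abstract does not specify the clustering step or the "three-in-one" algorithm, so plain random oversampling and undersampling stand in for them here. The function names (`rebalance`, `initial_weights`, `g_mean_and_f1`), the target size of 20 samples per class, and the class weights 2.0 and 1.0 are all hypothetical choices made for illustration. The boosting step itself is omitted.

```python
import math
import random


def rebalance(X, y, minority_label, target_per_class, seed=0):
    """Return a resampled (X, y) with `target_per_class` samples per class.

    Assumption: plain random sampling replaces the clustering and
    "three-in-one" steps, which the abstract does not describe.
    Assumes len(minority) <= target_per_class <= len(majority).
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    # Oversample: keep every minority sample, then add random duplicates.
    extra = target_per_class - len(minority)
    over = minority + rng.choices(minority, k=extra)
    # Undersample: keep a random subset of the majority, without replacement.
    under = rng.sample(majority, target_per_class)
    idx = over + under
    return [X[i] for i in idx], [y[i] for i in idx]


def initial_weights(y, class_weight):
    """Assign each sample the weight of its class, normalised to sum to 1.

    These would serve as the initial sample weights of a boosting run;
    the weight values themselves are hypothetical.
    """
    raw = [class_weight[label] for label in y]
    total = sum(raw)
    return [w / total for w in raw]


def g_mean_and_f1(y_true, y_pred, positive):
    """Compute G-mean and F1-score, treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(recall * specificity)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return g_mean, f1


if __name__ == "__main__":
    # Toy data: 5 minority samples (label 1) and 95 majority samples (label 0).
    X = [[i] for i in range(100)]
    y = [1] * 5 + [0] * 95

    X_res, y_res = rebalance(X, y, minority_label=1, target_per_class=20)
    weights = initial_weights(y_res, {1: 2.0, 0: 1.0})
    print(y_res.count(1), y_res.count(0))  # 20 20

    # A classifier that ignores the minority class is 95% accurate on the
    # original data, yet both G-mean and F1-score are 0.
    always_majority = [0] * len(y)
    print(g_mean_and_f1(y, always_majority, positive=1))  # (0.0, 0.0)
```

The last two lines show why AUC, G-mean and F1-score are reported rather than accuracy: a model that never predicts the minority class scores well on accuracy but zero on G-mean and F1-score. In a full pipeline, the resampled samples and their initial weights would be handed to an AdaBoost implementation to train the base learners.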