Boosted equalization ensemble learning algorithm for imbalanced data
To effectively solve the pseudo-balancing problem that under-sampling techniques face when dealing with imbalanced data, a boosted equalization ensemble learning algorithm based on under-sampling was proposed. A new equalization sampling mechanism coordinated the prediction probabilities of the data through a binning operation, so that a high-quality training subset could be generated at each iteration to train the classifiers. Based on the false-positive and false-negative rates of the base classifiers on the original data, weights were assigned to them adaptively during the iterative process, so as to prevent poorly performing classifiers from influencing the overall decision and to improve the generalization ability of the ensemble model. The new algorithm was able to increase the recognition of majority class samples while eliminating pseudo-balancing, thus reducing the impact of boundary ambiguity on the classification model. Comparative experiments on 18 small datasets and 2 large datasets showed that the algorithm had an advantage in imbalanced data classification problems.
Keywords: under-sampling; class imbalance; imbalance learning; ensemble learning; imbalanced data classification
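
The two mechanisms named in the abstract, binning-based equalization sampling and weighting base classifiers by their error rates, can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: the function names `equalized_bin_sample` and `rate_based_weights` are hypothetical, the bins are taken to be equal-width over [0, 1], each bin is assumed to contribute equally to the sampled subset, and the weight formula (one minus the mean of the false-positive and false-negative rates, then normalized) is a stand-in rather than the formula used in the paper.

```python
import random


def equalized_bin_sample(probs, n_keep, n_bins=5, seed=0):
    """Pick up to n_keep majority-class indices, spread evenly over probability bins.

    probs: predicted probabilities in [0, 1] for the majority-class samples
    (assumed to come from the current ensemble).
    """
    rng = random.Random(seed)
    # Group sample indices into equal-width bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for i, p in enumerate(probs):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[b].append(i)
    non_empty = [b for b in bins if b]
    for b in non_empty:
        rng.shuffle(b)
    chosen = []
    # Round-robin over bins: each bin contributes one sample per pass
    # until it is exhausted or n_keep samples have been drawn.
    while len(chosen) < n_keep and any(non_empty):
        for b in non_empty:
            if b and len(chosen) < n_keep:
                chosen.append(b.pop())
    return sorted(chosen)


def rate_based_weights(rates):
    """Turn (false_positive_rate, false_negative_rate) pairs into normalized weights.

    Assumed formula: raw weight = 1 - mean(FPR, FNR), clipped at zero.
    """
    raw = [max(0.0, 1.0 - (fpr + fnr) / 2.0) for fpr, fnr in rates]
    total = sum(raw)
    if total == 0:
        # Every classifier is maximally wrong; fall back to uniform weights.
        return [1.0 / len(rates)] * len(rates)
    return [w / total for w in raw]
```

Under these assumptions, a call such as `equalized_bin_sample(probs, n_keep=len(minority))` would yield a majority subset that is not dominated by the densely populated low-probability bins, and `rate_based_weights` would give a classifier with lower error rates on the original data a larger say in the ensemble vote.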