An investigation into the impact of resampling methods for class-imbalanced datasets
To evaluate the impact of resampling methods on class-imbalanced datasets, an investigation was conducted using the widely recognized Wisconsin breast cancer diagnosis dataset from the United States. Experiments were carried out with three machine learning algorithms: Logistic Regression, Support Vector Machine, and Random Forest. Four resampling methods (Random Over-sampling, Random Under-sampling, SMOTE, and ADASYN) were compared using F1 scores and AUC values. The experimental results indicate that all four resampling methods can improve model performance, with Random Under-sampling proving more effective than the other three in handling class-imbalanced datasets.
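To make the two random resampling strategies and the F1 metric concrete, the following is a minimal standard-library sketch, not the code used in the experiments. The function names, the seed, and the toy 90/10 dataset are illustrative assumptions; SMOTE and ADASYN, which synthesize new minority samples rather than duplicating existing ones, are omitted.

```python
import random
from collections import Counter


def _indices_by_class(y):
    """Group sample indices by class label."""
    groups = {}
    for idx, label in enumerate(y):
        groups.setdefault(label, []).append(idx)
    return groups


def random_under_sample(X, y, seed=0):
    """Randomly discard samples until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_min = min(len(ix) for ix in groups.values())
    keep = []
    for ix in groups.values():
        keep.extend(rng.sample(ix, n_min))  # sample without replacement
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def random_over_sample(X, y, seed=0):
    """Randomly duplicate samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_max = max(len(ix) for ix in groups.values())
    keep = []
    for ix in groups.values():
        keep.extend(ix)  # keep every original sample
        keep.extend(rng.choices(ix, k=n_max - len(ix)))  # add duplicates
    return [X[i] for i in keep], [y[i] for i in keep]


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 90 majority samples (label 0), 10 minority (label 1).
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10

X_under, y_under = random_under_sample(X, y)
X_over, y_over = random_over_sample(X, y)
print(Counter(y_under), Counter(y_over))
```

In an evaluation of this kind, resampling would normally be applied to the training split only, so that the F1 and AUC values are computed on test data that keeps the original class distribution.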