Unbalanced data classification algorithm based on improved SMOTE

SMOTE is a classic oversampling algorithm for handling imbalanced data, and this paper proposes an improved version of it. First, the k-means algorithm is used to cluster the original data, and a class discriminant function is used to screen the clustered samples and select the "safe samples". Then, a new oversampling rate is used to linearly interpolate the "safe samples", with the LMKNN method applied during the interpolation process. The proposed algorithm, SMOTE, and KNSMOTE are each applied to real-world data, an SVM classifier is used for classification, and the resulting performance is compared. The results show that on imbalanced datasets such as Abalone and Ecoli, the proposed algorithm achieves the best classification performance of the compared methods, which verifies its effectiveness.
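The steps named in the abstract (cluster, keep "safe samples", interpolate) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define the class discriminant function, the new oversampling rate, or how LMKNN enters the interpolation. The sketch therefore assumes a hypothetical filter (`safe_minority`, which keeps minority points from clusters whose minority share reaches `min_share`), a simple balancing count for `n_new`, and interpolation toward the local mean of the k nearest safe neighbours as a stand-in for the LMKNN step. All function names and parameters are illustrative.

```python
import math
import random


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def safe_minority(points, y, clusters, minority=1, min_share=0.5):
    """Assumed filter: keep minority points from minority-dominated clusters."""
    safe = []
    for c in sorted(set(clusters)):
        idx = [i for i, lab in enumerate(clusters) if lab == c]
        share = sum(y[i] == minority for i in idx) / len(idx)
        if share >= min_share:
            safe += [points[i] for i in idx if y[i] == minority]
    return safe


def oversample(safe, n_new, k=3, seed=0):
    """SMOTE-style linear interpolation toward the local mean of the
    k nearest safe neighbours (an assumed reading of the LMKNN step)."""
    if len(safe) < 2:
        raise ValueError("need at least two safe samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(safe)
        neighbours = sorted((p for p in safe if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        local_mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + lam * (m - a) for a, m in zip(x, local_mean)])
    return synthetic


# Toy data: a 3x3 majority grid, four minority points far away,
# and one minority point buried inside the majority region.
majority = [[float(a), float(b)] for a in range(3) for b in range(3)]
minority = [[5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [1.0, 1.5]]
X = majority + minority
y = [0] * len(majority) + [1] * len(minority)

clusters = kmeans(X, k=2)
safe = safe_minority(X, y, clusters)
# Assumed rate: generate just enough points to balance the two classes.
new_points = oversample(safe, n_new=len(majority) - len(minority))
```

On this toy data the minority point at (1.0, 1.5) falls in a majority-dominated cluster, so it is excluded from the safe set and no synthetic point is generated near it. In a full pipeline the original data plus `new_points` would then be passed to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`), which is the classifier the abstract reports using for the comparison.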

imbalanced data; SMOTE algorithm; SVM algorithm

马宝霖、胡茜

School of Mathematics and Statistics, Changchun University of Technology, Changchun, Jilin 130012, China

Jilin Province Major Science and Technology Special Project

20210301038GX

2024

Journal of Changchun University of Technology
Changchun University of Technology

Impact factor: 0.282
ISSN:1674-1374
Year, volume (issue): 2024, 45(3)