Simulation of Unbalanced Data Classification and Grading Algorithm Based on Relief Algorithm

The classification and grading of imbalanced data is an indispensable step in ensuring the efficient use of big data technology. However, the process is easily disrupted by issues such as data attributes, redundancy, and imbalance. To address these issues, a naive Bayes classification and grading algorithm for imbalanced data is proposed. First, the synthetic minority oversampling technique (SMOTE) is employed to reduce the degree of data imbalance. Then, the distance correlation coefficient and the maximal information coefficient are used to select and screen features of the imbalanced data. Next, the Relief algorithm assigns weights to the screened features, which are input into the naive Bayes model for classification. Finally, a dynamic threshold algorithm is applied to complete the data grading. Experimental results show that the proposed algorithm has a short running time and high classification accuracy, and can effectively improve data processing performance.
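The Relief weighting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary-class problem with at least two samples per class, min-max scaled features, L1 distance for neighbor search, and an absolute-difference update. The function name `relief_weights` and the parameters `n_iter` and `seed` are hypothetical, chosen here for illustration only.

```python
import numpy as np

def relief_weights(X, y, n_iter=100, seed=0):
    """Relief-style feature weights for a two-class dataset (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - lo) / span

    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)                      # pick a random sample
        dist = np.abs(Xs - Xs[i]).sum(axis=1)    # L1 distance to every sample
        dist[i] = np.inf                         # never select the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        # Reward features that differ on the miss, penalize those that differ on the hit.
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_iter
    return w
```

Features with larger weights separate the classes better. In a pipeline like the one the abstract outlines, such weights could then be used to scale or rank the features before they are passed to a naive Bayes classifier; how the paper combines the weights with the classifier is not stated in the abstract.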

Unbalance degree; Distance coefficient; Feature matrix; Weight allocation; Posterior probability

Liang Danning (梁丹凝), Liang Jian (梁坚)

School of Biomedical Information and Engineering, Hainan Medical University, Haikou, Hainan 570000

School of Science, East China University of Technology, Nanchang, Jiangxi 330013

Imbalance degree; distance correlation coefficient; feature matrix; weight allocation; posterior probability

2024

Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(6)