Computer Simulation (计算机仿真), 2024, Vol. 41, Issue 8: 338-343.

Improved Clustering Algorithm for Data Density Peak of Rotating Balanced Forest

Heng Xin (衡欣) 1, Jiao Yugan (焦禹淦) 2, Zheng Yanbin (郑延斌) 3

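Author information

  • 1. School of Information Engineering, Xinxiang Institute of Engineering, Xinxiang, Henan 453000
  • 2. School of Computer Science and Mathematics, Anyang University, Xinxiang, Henan 453500
  • 3. College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007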

Abstract

Minority-class samples are scarce in imbalanced data, which leads to low classification and detection accuracy for those classes. To improve minority-class detection accuracy and the generality of classification, this paper combines the adaptive density peak clustering optimization algorithm (ADPC) with the rotating balanced forest algorithm (ROBF) and proposes an improved density peak clustering algorithm for imbalanced data, ROBF-ADPC. First, SMOTE sampling synthesizes minority-class samples to improve the covariance shrinkage of the imbalanced data, and feature subsets are obtained according to the system parameters. Next, principal component analysis (PCA) applies a feature rotation transform to the feature subsets, and HSLS interpolation improves the balance of the data set. Then the local density of the samples is standardized, and samples near the "singular point" are stretched in the descending-order graph. Finally, an adaptive optimization strategy assigns the cluster centers and completes the classification of the imbalanced data. Ablation experiments show that each of the three optimization modules has a positive effect on the classification results and that stacking all three raises minority-class classification precision by 8.08%, at the cost of a slight decrease in time efficiency. Comparative experiments show that, across three data sets, ROBF-ADPC improves the minority-class classification accuracy R by 5.13% on average over the other eight fusion models, and its specificity is always the highest. In summary, the ROBF-ADPC model effectively improves minority-class detection accuracy on imbalanced data sets and has important simulation value.
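The density peak step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' ROBF-ADPC implementation: it uses the classic density peak formulation (Gaussian-kernel local density, distance to the nearest denser point), min-max scaling as an assumed stand-in for the paper's standardization, and a fixed number of centers `k` instead of the adaptive strategy. The SMOTE, PCA rotation, and HSLS steps are not reproduced. The function names and the parameters `dc` and `k` are hypothetical.

```python
import numpy as np


def density_peak_scores(X, dc):
    """Return (rho, delta, gamma) for each row of X.

    rho   : Gaussian-kernel local density with cutoff distance dc
    delta : distance to the nearest point of higher density
    gamma : product of min-max scaled rho and delta (assumed scaling)
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: sum of Gaussian weights, excluding the point itself.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0

    # Visit points from densest to sparsest; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.empty(n)
    # The densest point has no denser neighbour: use its largest distance.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()

    def minmax(v):
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    gamma = minmax(rho) * minmax(delta)
    return rho, delta, gamma


def pick_centers(gamma, k):
    """Indices of the k largest gamma values, in descending order."""
    return np.argsort(-gamma, kind="stable")[:k]
```

Sorting `gamma` in descending order gives the kind of descending-order graph the abstract refers to: a few points with both high density and large separation stand out at the front, and those are taken as cluster centers. In this sketch `k` must be supplied by the caller, whereas the paper describes an adaptive choice.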

Key words

Rotating balanced forest / Density peak clustering / Data sampling


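Funding

Henan Province Soft Science Project (142400411001)

Science and Technology Research Project of the Henan Provincial Department of Science and Technology (132102210537)

Publication year

2024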
Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348