Research on Cluster Analysis of College Grades Based on an Optimized K-means Algorithm
Zhang Liang (张梁) 1, Yang Libo (杨立波) 1, Zhang Xiaoyong (张小勇) 1, Shi Junbing (史俊冰) 1
Author information
- 1. Department of Intelligence and Automation, Taiyuan University, Taiyuan, Shanxi 030032
Abstract
The cluster centers of the classical K-means algorithm are easily affected by outliers, which makes the clustering results unstable. To address this problem, this paper proposes an optimized K-means algorithm based on sample distribution density to improve the stability and accuracy of clustering. After clustering, three discretization methods are evaluated objectively using two measures: the CH index and the overall percentage of classification intervals. The results show that the optimized K-means algorithm avoids unreasonable interval classification and reflects the distribution characteristics of the grade samples more accurately.
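The abstract does not give the details of the optimized algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one-dimensional grade data, a hypothetical density measure (the number of samples within a fixed `radius`), and a hypothetical seeding rule that picks the densest samples that are at least `2 * radius` apart. The function names `density_seeds`, `kmeans_1d`, and `ch_index` are illustrative. The CH index is computed with the standard Calinski-Harabasz formula.

```python
import numpy as np

def density_seeds(x, k, radius):
    """Pick k initial centers from dense regions (assumed heuristic)."""
    x = np.asarray(x, dtype=float)
    # Density of a sample = number of samples within `radius` of it.
    density = (np.abs(x[:, None] - x[None, :]) <= radius).sum(axis=1)
    seeds = []
    for i in np.argsort(-density, kind="stable"):
        # Keep a dense sample only if it is far from every seed chosen so far.
        if all(abs(x[i] - s) > 2 * radius for s in seeds):
            seeds.append(x[i])
        if len(seeds) == k:
            return np.array(seeds)
    raise ValueError("fewer than k well-separated dense samples; adjust radius")

def kmeans_1d(x, centers, n_iter=100):
    """Standard Lloyd iterations on 1-D data, starting from the given centers."""
    x = np.asarray(x, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        updated = np.array([
            x[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    # Recompute labels so they always match the returned centers.
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    return labels, centers

def ch_index(x, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    x = np.asarray(x, dtype=float)
    groups = [x[labels == j] for j in np.unique(labels)]
    n, k = len(x), len(groups)
    between = sum(len(g) * (g.mean() - x.mean()) ** 2 for g in groups)
    within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (between / (k - 1)) / (within / (n - k))

# Made-up grades with one low outlier (35); not data from the paper.
grades = [35, 58, 60, 62, 61, 75, 76, 74, 77, 90, 91, 92, 89]
seeds = density_seeds(grades, k=3, radius=3)
labels, centers = kmeans_1d(grades, seeds)
score = ch_index(grades, labels)
print(seeds, centers, score)
```

Because the seeds come from dense regions, the isolated score of 35 is never chosen as an initial center, which is the kind of outlier sensitivity the abstract describes. A higher CH index indicates tighter, better separated clusters, so it can be used to compare alternative discretizations of the same scores.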
Key words
mean algorithm/distribution density/clustering/K-means
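Funding
Shanxi Province Teaching Reform and Innovation Project (J20231427)
Shanxi Province College Students' Innovation and Entrepreneurship Training Program (20231442)
Shanxi College Students' Innovation and Entrepreneurship Training Program (20231472)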
Publication year
2024