
Research on a Classification Method Based on Sample Distribution

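In general, the KNN classifier and the neighborhood classifier perform well when samples are uniformly distributed. In practice, however, data acquisition is easily affected by objective factors such as technology and environment, so the resulting sample distribution is non-uniform, and under a non-uniform distribution neither classifier tends to deliver satisfactory results. To address this problem, the paper designs a k-neighborhood classifier that builds on both the KNN classifier and the neighborhood classifier. Its core idea is to assess the sample distribution in the local region around the sample to be classified, and then to raise classification accuracy by limiting the number of samples used in high-density regions and narrowing the sample search range in low-density regions. The experiments use 10 data sets in total from the UCI database and the ORL face database, and compare the k-neighborhood classifier with the KNN classifier and the neighborhood classifier under three norms. The results show that, compared with the other two classifiers, the k-neighborhood classifier not only improves classification accuracy but also has good classification stability.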
Research on Sample Distribution Based on Classification Method
Generally speaking, the KNN classifier and the neighborhood classifier perform well when the samples are evenly distributed, but in practical applications samples are easily affected by technology, environment and other factors during data acquisition, resulting in an uneven distribution. Because of this uneven distribution, neither of the two classifiers achieves satisfactory classification results. To deal with this problem, a k-neighborhood classifier is designed based on the KNN classifier and the neighborhood classifier. The core idea of this classifier is to judge the sample distribution in the local area of the sample to be tested, and then to improve classification accuracy by limiting the number of samples in high-density areas and reducing the sample search range in low-density areas. The experiments use 10 data sets from the UCI database and the ORL face database, and compare the k-neighborhood classifier with the KNN classifier and the neighborhood classifier under three norms. The experimental results show that, compared with the other two classifiers, the k-neighborhood classifier not only improves classification accuracy but also maintains good classification stability.
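The abstract does not give the exact decision rule, so the following is only a minimal sketch of one way the stated idea could be realized, not the paper's algorithm. It assumes that "limiting the number of samples in high-density areas" means capping the voters at the k nearest samples, and that "reducing the search range in low-density areas" means discarding any of those k samples that lie farther than a radius delta from the query. The function names, the parameters `k`, `delta` and `p`, the nearest-sample fallback, and the choice of p-norms are all assumptions made for illustration.

```python
from collections import Counter


def minkowski(a, b, p=2.0):
    """Distance induced by the p-norm; p=float('inf') gives the max-norm distance."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def k_neighborhood_predict(train_X, train_y, query, k=3, delta=1.0, p=2.0):
    """Hypothetical k-neighborhood rule: majority vote over samples that are
    both among the k nearest AND within radius delta of the query."""
    # Rank all training samples by distance to the query.
    ranked = sorted(
        ((minkowski(x, query, p), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    # Dense region: at most k samples vote (the KNN-style cap).
    nearest_k = ranked[:k]
    # Sparse region: distant samples are dropped (the neighborhood-style radius).
    voters = [(d, label) for d, label in nearest_k if d <= delta]
    # Assumed fallback: if the neighborhood is empty, use the single nearest sample.
    if not voters:
        voters = ranked[:1]
    votes = Counter(label for _, label in voters)
    return votes.most_common(1)[0][0]


def knn_predict(train_X, train_y, query, k=3, p=2.0):
    """Plain KNN for comparison: an infinite radius disables the neighborhood filter."""
    return k_neighborhood_predict(train_X, train_y, query, k, float("inf"), p)
```

In a sparse region the two rules can disagree: with a single class-`b` sample near the query and a dense class-`a` cluster far away, plain KNN with k = 3 is outvoted by the distant cluster, while the sketch above lets only the nearby sample vote.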

classifier; sample distribution; neighborhood; norm

Chen Liang


School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100

classifier; sample distribution; neighborhood; norm

2024

Computer & Digital Engineering
No. 709 Research Institute, China Shipbuilding Industry Corporation


CSTPCD
Impact factor: 0.355
ISSN:1672-9722
Year, volume (issue): 2024, 52(11)