An Improved Fast K-Nearest Neighbor Classification Method

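Abstract: The traditional K-nearest neighbor classification method must compute the distance between each test sample and every training sample, so its learning efficiency is low. To address this problem, an improved fast K-nearest neighbor classification method, SK-NN, is proposed. The method first clusters the training samples with K-means and obtains the center and radius of each resulting subset. It then uses these centers and radii to select a suitable subclass and labels the test sample using that subclass. Because each subclass obtained by clustering is much smaller than the original training set, fewer distances need to be computed, which improves the efficiency of the model.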

English title: An Improved Speeding K-Nearest Neighbor Classification Method

English abstract: In the traditional K-nearest neighbor classification method, the distance between each test sample and all training samples must be calculated, so the time complexity is high. To solve this problem, this paper presents an improved speeding K-NN classification method based on clustering division, called the SK-NN algorithm. First, the training samples are divided into multiple subsets by K-means clustering. Then the cluster to which the test sample belongs is determined from the cluster centers and radii, and the test sample is classified by K-NN on that subset. Because the subset is smaller than the original training set, the number of distances to be calculated decreases and the learning efficiency of the model improves.
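
The two stages described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state how the center and radius are combined to choose a subset, so the rule used here (smallest distance to the center minus the radius) is an assumption, as are the function names `kmeans`, `fit_sknn`, and `predict_sknn`, the Euclidean metric, and the parameters `m` (number of clusters) and `k` (number of neighbors).

```python
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_center(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: dist(p, centers[j]))


def kmeans(points, m, iters=20, seed=0):
    """Plain Lloyd's k-means; returns m centers and a cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, m)]
    for _ in range(iters):
        assign = [nearest_center(p, centers) for p in points]
        for j in range(m):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    # Final assignment against the final centers.
    return centers, [nearest_center(p, centers) for p in points]


def fit_sknn(X, y, m):
    """Cluster the training set; keep each subset with its center and radius."""
    centers, assign = kmeans(X, m)
    clusters = []
    for j, c in enumerate(centers):
        idx = [i for i, a in enumerate(assign) if a == j]
        if not idx:
            continue  # skip empty clusters
        clusters.append({
            "center": c,
            "radius": max(dist(X[i], c) for i in idx),
            "X": [X[i] for i in idx],
            "y": [y[i] for i in idx],
        })
    return clusters


def predict_sknn(clusters, q, k=3):
    """Choose one subset from center and radius, then run k-NN inside it only."""
    # ASSUMED selection rule: the cluster whose bounding sphere is nearest
    # to (or contains) the query, measured as distance-to-center minus radius.
    best = min(clusters, key=lambda c: dist(q, c["center"]) - c["radius"])
    nearest = sorted(zip(best["X"], best["y"]), key=lambda t: dist(q, t[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data: two well-separated groups of four points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = ["a"] * 4 + ["b"] * 4
model = fit_sknn(X, y, m=2)
print(predict_sknn(model, (0.4, 0.6)))
```

With this layout a query is compared against the cluster centers and then only against the members of the chosen subset, rather than against every training sample, which is the source of the saving the abstract describes.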

English keywords:

K-Nearest Neighbor Classification; Clustering; Subset

Authors:

Li Wei (李伟), Cheng Litao (程利涛)

Author affiliation:

Cadre Training Center, Daqin Railway Co., Ltd. 030013

Keywords:

K-nearest neighbor classification; clustering; subset

Publication year:

2015

DOI:

10.3969/j.issn.1007-1423.2015.35.003

Modern Computer (现代计算机), Popular Edition

Sun Yat-sen University

Impact factor: 0.202

ISSN: 1007-1423

Year, volume (issue): 2015.(12)

Cited by: 4
References: 5