A Survey of SVM Sample Reduction Algorithms (SVM样本约简算法研究综述)

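Abstract: Support vector machine (SVM) is a supervised machine learning algorithm developed from statistical learning theory and the principle of structural risk minimization. It effectively overcomes problems such as local minima and the curse of dimensionality, generalizes well, and is widely used in pattern recognition and artificial intelligence. However, the learning efficiency of SVM drops significantly as the number of training samples grows. On large-scale training sets, a traditional SVM that relies on standard optimization methods faces excessive memory requirements and slow execution, and sometimes cannot be run at all. To alleviate the high storage requirements and long training time of SVM on large-scale training sets, researchers have proposed SVM sample reduction algorithms. This paper first introduces the theoretical basis of SVM, then systematically reviews the state of research on SVM sample reduction algorithms from five perspectives (clustering, geometric analysis, active learning, incremental learning, and random sampling), discusses the advantages and disadvantages of each class of algorithm, and finally summarizes the paper and looks ahead to future work.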

English title: Overview of Sample Reduction Algorithms for Support Vector Machine

English abstract: Support vector machine (SVM) is a supervised machine learning algorithm developed from statistical learning theory and the principle of structural risk minimization. It effectively overcomes the problems of local minima and the curse of dimensionality, has good generalization performance, and has been widely used in pattern recognition and artificial intelligence. However, the learning efficiency of SVM decreases significantly as the number of training samples increases. For large-scale training sets, the traditional SVM with standard optimization methods faces excessive memory requirements and slow training, and sometimes cannot be executed at all. To alleviate the high storage requirements and long training time of SVM on large-scale training sets, scholars have proposed SVM sample reduction algorithms. This paper first introduces the theoretical basis of SVM and then systematically reviews the current research status of SVM sample reduction algorithms from five perspectives: clustering, geometric analysis, active learning, incremental learning, and random sampling. It discusses the advantages and disadvantages of these algorithms and finally presents an outlook on future research on SVM sample reduction methods.

English keywords:

Support vector machine; Large-scale data set; Sample reduction; Machine learning; Classification

Authors:

Zhang Daili (张代俐), Wang Tinghua (汪廷华), Zhu Xinglin (朱兴淋)

Affiliation:

School of Mathematics and Computer Science, Gannan Normal University, Ganzhou, Jiangxi 341000, China

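Keywords (Chinese):

支持向量机 (support vector machine); 大规模数据集 (large-scale data set); 样本约简 (sample reduction); 机器学习 (machine learning); 分类 (classification)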

Funding:

National Natural Science Foundation of China; Jiangxi Provincial Graduate Innovation Special Fund

Project numbers:

61966002; YC2022-s944

Publication year:

2024

DOI:

10.11896/jsjkx.230400143

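Computer Science (计算机科学)

Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)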

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(7)

Citations: 1
References: 3