Overview of Sample Reduction Algorithms for Support Vector Machine
Support vector machine (SVM) is a supervised machine learning algorithm based on statistical learning theory and the principle of structural risk minimization. It effectively overcomes the problems of local minima and the curse of dimensionality, and it has good generalization performance. SVM has been widely used in pattern recognition and artificial intelligence. However, its learning efficiency decreases significantly as the number of training samples increases. On large-scale training datasets, a traditional SVM trained with standard optimization methods faces excessive memory requirements and slow training, and sometimes cannot be run at all. To alleviate the high storage requirements and long training times of SVM on large-scale training sets, scholars have proposed SVM sample reduction algorithms. This paper first introduces the theoretical basis of SVM and then systematically reviews the current state of research on SVM sample reduction algorithms in five categories: methods based on clustering, geometric analysis, active learning, incremental learning, and random sampling. It discusses the advantages and disadvantages of these algorithms and finally presents an outlook on future research into SVM sample reduction methods.
Keywords: Support vector machine; Large-scale data set; Sample reduction; Machine learning; Classification