
Evaluation of Hyperparameter Optimization Techniques for Traditional Machine Learning Models

Reasonable hyperparameters ensure that machine learning models can adapt to different backgrounds and tasks. To avoid the inefficiency of manual tuning when a model has many hyperparameters and a vast search space, a variety of hyperparameter optimization techniques have been developed and applied to machine learning model training. This paper first reviews eight common hyperparameter optimization techniques: grid search, random search, Bayesian optimization, Hyperband, Bayesian optimization and Hyperband (BOHB), genetic algorithms, particle swarm optimization (PSO), and the covariance matrix adaptation evolution strategy (CMA-ES). The advantages and disadvantages of these methods are analyzed from six aspects: time performance, final results, parallel capability, scalability, robustness, and flexibility. Subsequently, the eight methods are applied to four traditional machine learning models: LightGBM, XGBoost, random forest, and K-nearest neighbors (KNN). Regression, binary classification, and multi-classification experiments are performed on four benchmark datasets (the Boston house price dataset, the kin8nm robot arm dataset, the credit card default customers dataset, and a handwritten digit dataset), and the methods are compared using the resulting evaluation metrics. Finally, the pros and cons of each method are summarized, and the applicable scenarios of the different methods are given. The results highlight the importance of selecting appropriate hyperparameter optimization methods to enhance the efficiency and effectiveness of machine learning model training.
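To make the two baseline techniques above concrete, the following minimal Python sketch runs grid search and random search over a KNN classifier on scikit-learn's bundled handwritten-digit data. It is an illustration only, not the authors' experimental code: the search spaces, the 5-fold cross-validation, the accuracy scoring, and the budget of 20 random trials are assumptions made here for the example.

# Minimal sketch (not the paper's code): grid search vs. random search for KNN.
# Search spaces, CV folds, scoring, and trial budget are illustrative assumptions.
from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Grid search: every combination in the grid is evaluated with 5-fold CV.
grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [3, 5, 7, 9], "weights": ["uniform", "distance"]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X, y)
print("grid search  :", grid.best_params_, grid.best_score_)

# Random search: a fixed budget of 20 configurations sampled from the same
# space; typically cheaper than an exhaustive grid as the space grows.
rand = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions={"n_neighbors": randint(1, 30),
                         "weights": ["uniform", "distance"]},
    n_iter=20,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
rand.fit(X, y)
print("random search:", rand.best_params_, rand.best_score_)

The same fit-and-score loop generalizes to the other models and techniques compared in the paper; Bayesian optimization, Hyperband, BOHB, and the evolutionary methods differ mainly in how the next configuration to evaluate is chosen.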

Traditional machine learning; Hyperparameter optimization; Bayesian optimization; Multi-fidelity technology; Meta-heuristic algorithms

李海霞、宋丹蕾、孔佳宁、宋亚飞、常海艳


North Automatic Control Technology Institute, Taiyuan 030006

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049

Air and Missile Defense College, Air Force Engineering University, Xi'an 710051


Funding: National Natural Science Foundation of China (61806219, 61876189); Young Talent Support Program of the Shaanxi Provincial Association for Science and Technology in Universities (20220106); Shaanxi Provincial Innovation Capability Support Plan (2020KJXX-065)

Journal: Computer Science (计算机科学)
Publisher: Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)
Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, Volume (Issue): 2024, 51(8)