Travel Mode Choice Based on Hyperparameter Optimization and Ensemble Learning

Abstract: Conventional travel mode choice models and machine learning models suffer from low prediction accuracy, complex hyperparameter optimization, and limited interpretability. To address these problems, this paper applies a genetic algorithm and Bayesian optimization, separately, to tune the hyperparameters of the extreme gradient boosting (XGBoost) model. SHAP (SHapley Additive exPlanations) is then integrated to visualize the nonlinear relationships between the choice probability and both travel mode attributes and individual characteristics. The models are trained with 5-fold cross-validation to avoid overfitting and are evaluated on the Swissmetro dataset. The results indicate that enhancing the nonlinear representation of the utility function in discrete choice models improves prediction performance, yet these models still fall short of machine learning models. The XGBoost models tuned by the genetic algorithm and by Bayesian optimization achieve higher accuracy, recall, and F1 score in travel choice prediction than multinomial Logit models with linear or nonlinear utility functions, as well as non-optimized random forest and XGBoost models. The XGBoost model optimized by the genetic algorithm attains the highest prediction accuracy, 0.781, surpassing conventional models tuned by repeated grid search. Hyperparameter optimization with the genetic algorithm also reduces training time by 81.4% compared with repeated grid search. The cost and time of the different travel modes are important factors in mode choice: train and car are more sensitive to time, while Swissmetro is more sensitive to cost.
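The genetic-algorithm hyperparameter search mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the search space (`SPACE`, with names borrowed from common XGBoost hyperparameters and arbitrary ranges), the operators (uniform crossover, resampling mutation, tournament selection, elitism), and the helper names. The `fitness` argument stands in for what would, in the paper's setting, presumably be the mean 5-fold cross-validated accuracy of an XGBoost model; here a toy function replaces it so the block runs with the standard library alone.

```python
import random

# Hypothetical search space: names mirror common XGBoost hyperparameters,
# but the ranges are illustrative assumptions, not the paper's settings.
SPACE = {
    "max_depth": (2, 12, int),
    "learning_rate": (0.01, 0.5, float),
    "subsample": (0.5, 1.0, float),
}

def sample_gene(name, rng):
    lo, hi, kind = SPACE[name]
    return rng.randint(lo, hi) if kind is int else rng.uniform(lo, hi)

def random_individual(rng):
    return {name: sample_gene(name, rng) for name in SPACE}

def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent chosen at random.
    return {name: (a if rng.random() < 0.5 else b)[name] for name in SPACE}

def mutate(ind, rng, rate=0.2):
    # Resample each gene independently with probability `rate`.
    return {name: sample_gene(name, rng) if rng.random() < rate else value
            for name, value in ind.items()}

def tournament(scored, rng, k=3):
    # Return the fittest of k randomly drawn (score, individual) pairs.
    return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]

def genetic_search(fitness, pop_size=20, generations=15, elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best_score, best = float("-inf"), None
    for gen in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        if gen == generations - 1:
            break  # no need to breed a population that is never evaluated
        # Elitism: carry the top individuals over unchanged.
        next_pop = [ind for _, ind in scored[:elite]]
        while len(next_pop) < pop_size:
            child = crossover(tournament(scored, rng), tournament(scored, rng), rng)
            next_pop.append(mutate(child, rng))
        population = next_pop
    return best, best_score

def toy_fitness(params):
    # Stand-in for a cross-validated score; its maximum (0) is at
    # max_depth=6, learning_rate=0.1, subsample=0.8.
    return (-((params["max_depth"] - 6) ** 2) / 100
            - (params["learning_rate"] - 0.1) ** 2
            - (params["subsample"] - 0.8) ** 2)

best, score = genetic_search(toy_fitness)
print(best, score)
```

To move from the toy to a real model, `toy_fitness` would be replaced by a function that trains a classifier with the given parameters and returns its cross-validated score. Each fitness call then costs one full cross-validation, so `pop_size * generations` bounds the number of model fits.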

Keywords:

urban traffic; individual travel prediction; hyperparameter optimization; travel mode choice; explainable machine learning

Authors:

LI Xiaodong (李晓东), CAO Kerang (曹克让), KUANG Haibo (匡海波)

Affiliation:

Collaborative Innovation Center for Transport Studies, Dalian Maritime University, Dalian 116026, Liaoning, China

Publication year:

2024

DOI:

10.16097/j.cnki.1009-6744.2024.06.014

Journal of Transportation Systems Engineering and Information Technology

Systems Engineering Society of China

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.664

ISSN: 1009-6744

Year, volume(issue): 2024, 24(6)