

Research on nonlinear relationship between subway built environment and travel distance of stations based on XGBOOST-SHAP
Compared with traditional analyses of metro passenger flow characteristics, studying the average travel distance at metro stations gives a finer-grained picture of how passengers move through the network. To explore the complex relationship between multiple built environment factors and station-level average travel distance, this study takes the Xi'an metro system as its case. Eleven built environment indicators were constructed, covering land use, point-of-interest (POI) distribution, the surrounding transport environment, and station attributes. An interpretable machine learning model combining Extreme Gradient Boosting with the SHAP attribution framework (XGBOOST-SHAP) was built to reveal the nonlinear relationship between the built environment and travel distance. The model's regression fit was compared with that of Gradient Boosting Decision Trees (GBDT) and Ordinary Least Squares (OLS) regression to verify its advantage. The results show that the XGBOOST model achieves an R-squared of 0.75, a mean absolute error (MAE) of 0.95, and a mean squared error (MSE) of 1.36, a better fit than both the GBDT and OLS models. Station-level average travel distance is spatially heterogeneous, with a clear ring-shaped distribution. The SHAP attribution analysis shows that distance to the city center contributes the most; road network density, land use mix, the number of bus routes, and the number of residences also contribute relatively strongly. The POI Shannon entropy index and food service points show no clear positive or negative feedback on average travel distance, while the remaining indicators show a combination of positive and negative feedback. The results provide data support for travel demand analysis, line capacity optimization, and operational performance evaluation, and can help improve the convenience of metro travel, meet travel needs in different areas, and enhance the efficiency and sustainability of the whole metro system.
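The attribution idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values for a small hand-written model. Everything in it is an assumption made for illustration: the `toy_model` function, its coefficients, the feature names, and the baseline and station values are hypothetical and are not the paper's fitted XGBOOST model or data. The sketch replaces "absent" features with a single baseline value and enumerates every feature ordering, which is only practical for a handful of features; tree-based SHAP implementations use faster algorithms and a background dataset instead.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    A feature that is "absent" from a coalition keeps its baseline value.
    Every ordering of the features is enumerated, so the cost grows
    factorially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i from baseline to actual
            new = predict(current)
            phi[i] += new - prev       # marginal contribution in this ordering
            prev = new
    n_orders = factorial(n)
    return [p / n_orders for p in phi]


# Hypothetical nonlinear model (NOT the paper's model). Inputs:
# [distance to city center, road network density, number of bus routes]
def toy_model(features):
    dist, road, bus = features
    return 6.0 + 0.5 * dist - 0.02 * dist * dist - 0.3 * road + 0.01 * road * bus


baseline = [8.0, 5.0, 10.0]   # assumed reference station
station = [15.0, 3.0, 20.0]   # assumed station to explain

names = ["dist_to_center", "road_density", "bus_routes"]
phi = shapley_values(toy_model, station, baseline)
for name, value in zip(names, phi):
    # The sign shows whether the feature pushes the prediction up or down.
    print(f"{name}: {value:+.3f}")

# Efficiency property: the attributions sum to the prediction gap.
gap = toy_model(station) - toy_model(baseline)
print(f"sum of attributions: {sum(phi):.3f}, prediction gap: {gap:.3f}")
```

The sign of each value corresponds to what the abstract calls positive or negative feedback for a single station; plotting such values against the feature across all stations is what exposes a nonlinear relationship.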

metro stations; built environment; travel distance; XGBOOST model; SHAP attribution analysis; nonlinear relationship

Li Peikun (李培坤), Chen Xumei (陈旭梅), Lu Wenbo (鲁文博), Ma Jiaxin (马嘉欣), Liu Yi (刘屹), Wang Hao (王昊)


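Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China

School of Transportation, Southeast University, Nanjing, Jiangsu 214135, China

Chongqing Municipal Research Institute of Design Co., Ltd., Chongqing 400020, China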


National Natural Science Foundation of China (72271020, 71871013)

2024

Journal of Railway Science and Engineering (铁道科学与工程学报)
Central South University; China Railway Society

Indexed in: CSTPCD, Peking University Core Journals (北大核心), EI
Impact factor: 0.837
ISSN: 1672-7029
Year, volume (issue): 2024, 21(4)