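Freeway segments are fully enclosed and vehicles travel at high speeds, so a traffic accident often causes immeasurable losses. To address these problems, a freeway accident risk assessment method is proposed that combines indicator screening by the Random Forest (RF) algorithm with the eXtreme Gradient Boosting (XGBoost) algorithm. First, private car trajectory data from freeway accident segments are screened to build a data foundation for accident risk assessment under four spatiotemporal conditions (30 km upstream and 30 min before the accident, 10 km upstream and 15 min before the accident, 10 km upstream and 10 min before the accident, and 10 km upstream and 5 min before the accident). Second, a combined Random Forest and XGBoost (RF-XGBoost) accident risk assessment method is established, which screens the operating indicators of vehicles traveling on the freeway and then assesses freeway accident risk. Finally, five metrics are used to evaluate algorithm performance: accuracy, precision, recall, balanced F score (F1), and area under the curve (AUC). The results show that the RF-XGBoost combined algorithm outperforms Decision Tree (DT), Support Vector Machine (SVM), and the conventional XGBoost algorithm in accident risk assessment; compared with conventional XGBoost, RF-XGBoost improves average accuracy by 11.1%, average precision by 8.9%, and average recall by 7.625%. Under the spatiotemporal condition of 10 km upstream and 10 min before the accident, the algorithm's accuracy reaches 80% and its overall assessment performance is the best. The results can provide theoretical and methodological support for accident risk assessment and dynamic early warning for private cars on freeways.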
Trajectory data-driven risk assessment of freeway accidents
To address the challenges posed by fully enclosed freeway segments, high vehicle speeds, and the substantial damage caused by traffic accidents, this study proposes a freeway accident risk assessment method that integrates the Random Forest (RF) algorithm for feature selection with the eXtreme Gradient Boosting (XGBoost) algorithm. First, private vehicle trajectory data from freeway accident segments are filtered to establish a data foundation for accident risk assessment under four spatiotemporal conditions (30 km upstream and 30 minutes before the accident, 10 km upstream and 15 minutes before the accident, 10 km upstream and 10 minutes before the accident, and 10 km upstream and 5 minutes before the accident). Next, a combined accident risk assessment method based on RF and XGBoost (RF-XGBoost) is constructed; it first selects among the operational indicators of vehicles on the freeway and then assesses accident risk. Finally, the algorithm's performance is assessed using five metrics: accuracy, precision, recall, balanced F score (F1), and area under the curve (AUC). The results indicate that the RF-XGBoost combined algorithm outperforms the Decision Tree (DT), Support Vector Machine (SVM), and conventional XGBoost algorithms in accident risk assessment. Compared with the conventional XGBoost algorithm, the RF-XGBoost algorithm increases average accuracy by 11.1%, average precision by 8.9%, and average recall by 7.625%. Under the spatiotemporal condition of 10 km upstream and 10 minutes before the accident, the algorithm achieves an accuracy of 80% and the best overall assessment performance. These findings provide theoretical and methodological support for freeway accident risk assessment and dynamic warnings for private vehicles.
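The five evaluation metrics named in the abstract can be made concrete with a short sketch. The code below is a minimal, standard-library-only illustration and not the authors' evaluation code. It assumes binary labels (1 = accident risk, 0 = no risk), a decision threshold of 0.5, and a small set of hypothetical scores invented purely for demonstration. The function names `confusion_counts`, `auc_score`, and `evaluate` are likewise illustrative.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1, and AUC from predicted risk scores."""
    # The 0.5 threshold is an assumption; the abstract does not state one.
    y_pred: List[int] = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc_score(y_true, scores),
    }


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
metrics = evaluate(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank positive samples against negative ones. This is one reason for reporting AUC alongside the other four when comparing classifiers.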