Identification of Fee Evasion Behavior in Expressway Changing Path Based on Random Forest
To improve the efficiency of identifying toll evasion by changing paths on expressways, this study builds a random forest (RF) model that identifies this type of evasion and helps expressway management departments recover the evaded tolls. First, the raw toll data are analyzed, the fields relevant to the study are selected, and 12 initial model input features are computed from them. Second, features exhibiting collinearity are removed according to their variance inflation factor (VIF) and tolerance (TOL) values, and the Boruta algorithm is used to select the high-importance features: "whether the driving direction is consistent", "whether the entry and exit stations are consistent", "travel time", and "minimum-fare mileage". Third, the data set is balanced with the SMOTETomek combined resampling technique. Fourth, the hyperparameters of the random forest are tuned by grid search. Finally, the model is trained and used for identification, and its performance is compared with that of baseline models. The results show that the proposed model identifies path-changing toll evasion better than the baselines: its Macro-F1 score reaches 0.966, compared with 0.9431 for extreme gradient boosting (XGBoost), 0.9563 for decision tree (DT), and 0.9382 for gradient boosting decision trees (GBDT). The model can serve as a reference for operation management departments when auditing vehicles suspected of this type of toll evasion.
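
Two quantities named in the abstract, the VIF/TOL collinearity screen and the Macro-F1 score, can be illustrated with a minimal standard-library Python sketch. This is not the authors' code. It relies on the standard identities that VIF_j equals the j-th diagonal entry of the inverse feature correlation matrix, that TOL_j = 1 / VIF_j, and that Macro-F1 is the unweighted mean of per-class F1 scores. The feature names, toy values, and the VIF cut-off of 10 are assumptions made for illustration only; the abstract does not state the threshold used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] -> [I | A^-1].
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tol(columns):
    """Map each feature name to (VIF, TOL), where TOL = 1 / VIF."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: (inv[i][i], 1.0 / inv[i][i]) for i, name in enumerate(names)}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy features; not data from the study.
    features = {
        "travel_time_min": [30, 45, 60, 90, 120, 150],
        "min_fee_mileage_km": [25, 40, 52, 80, 101, 130],
        "direction_consistent": [1, 1, 0, 1, 0, 0],
    }
    VIF_THRESHOLD = 10.0  # assumed rule-of-thumb cut-off, not from the paper
    for name, (vif, tol) in vif_and_tol(features).items():
        flag = "collinear" if vif > VIF_THRESHOLD else "ok"
        print(f"{name}: VIF={vif:.2f} TOL={tol:.3f} -> {flag}")
    print("Macro-F1:", macro_f1([0, 0, 0, 1], [0, 0, 1, 1]))
```

Because Macro-F1 weights every class equally, the rare evasion class counts as much as the majority class, which is why it suits an imbalanced problem such as this one.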

Keywords: random forest (RF); toll evasion by changing path; Boruta algorithm; data imbalance processing

Zou Jie, Cao Honglu, Li Ping'an, Huang Shiyin, Zhao Jiandong

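中公华通(北京)科技发展有限公司, Beijing 100088, China

School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China

School of Systems Science, Beijing Jiaotong University, Beijing 100044, China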

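Science Technology and Engineering (科学技术与工程)
Chinese Society of Technology Economics (中国技术经济学会)

Indexed in: CSTPCD, Peking University Core (北大核心)
Impact factor: 0.338
ISSN: 1671-1815
Year, volume (issue): 2024, 24(36)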