An Ensemble Prediction Model for Traffic Accident Severity Based on SHAP Interpretation
This study aims to improve the efficiency of facility hazard identification following traffic accidents by constructing a SMOTE-optimized ensemble model for predicting traffic accident severity. It investigates the relationship between facility-related features and accident severity, using the publicly available US Accident dataset for training and validation. First, SMOTE was applied to address class imbalance. Next, an ensemble learning approach was implemented, with logistic regression as the meta-learner and AdaBoost, LightGBM, and logistic regression as base learners combined through a weighted voting strategy to improve predictive performance. The model achieved 0.7817 for accuracy, recall, and F1 score, outperforming the individual models. SHAP values were then used to interpret the contributions of the facility-related features. The analysis indicates that traffic signal systems are the primary measure for reducing accident severity; that public transportation stations significantly influence accidents because of their high-density traffic; that parking lots carry elevated risk associated with vehicle parking activities; and that traffic calming facilities and signage help improve road safety and the driving experience.
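The class-balancing step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. This is not the study's implementation. The function names `smote` and `balance`, the two-class simplification, and the toy data are all assumptions made for illustration; in practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only, not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample one minority class until it matches all other samples combined.

    Assumes a two-class problem; a multi-class severity label would need
    this applied per class.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(y) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_new, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (made up): six majority samples, three minority samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.1],
     [5.0, 5.0], [5.2, 5.1], [5.1, 5.3]]
y = [0] * 6 + [1] * 3
X_bal, y_bal = balance(X, y, minority_label=1)
```

Oversampling of this kind is normally applied to the training split only, so that the validation data keep the original class distribution.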