Explainable machine learning models for landslide susceptibility mapping based on SHAP
In landslide susceptibility mapping, the adoption of machine learning has been limited by complex model training and the difficulty of interpreting prediction results. This study aims to enhance model credibility and interpretability by combining SHapley Additive exPlanations (SHAP) with machine learning models to reveal how evaluation factors influence landslide development. Taking Zhongxian County in the Three Gorges Reservoir Area as the study area, landslide susceptibility models and maps are constructed using the Random Forest, eXtreme Gradient Boosting (XGBoost), and Deep Random Forest algorithms. The optimal parameters are selected using a Bayesian optimization algorithm. Evaluation accuracy is validated using confusion matrices and receiver operating characteristic (ROC) curves. Four different classification methods are used to assess landslide susceptibility. Finally, the SHAP algorithm is applied to interpret the influence of different factors on landslide development. The results show that the optimized XGBoost model achieves an area under the ROC curve (AUC) of 0.817, higher than the AUC values of the Random Forest (0.803) and Deep Random Forest (0.806) models. The spatial distribution of susceptibility classes varies greatly among the classification methods, with the best results obtained by combining the equal-interval method with the XGBoost model. Based on this combined model, we find that the extremely high and high susceptibility zones are concentrated in the southeastern and northeastern parts of the study area, particularly along the banks of the Yangtze River and its tributaries. The SHAP analysis clarifies how different values of each dominant factor influence landslide development. We find that elevation and distance to rivers are the primary factors controlling landslide development in the study area. Overall, the XGBoost model used in this study exhibits high prediction accuracy and strong interpretability, providing a scientific basis for the precise prevention and control of landslide disasters.
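To make the SHAP step concrete, the following is a minimal sketch of the Shapley value computation that SHAP builds on, not the study's implementation. The function `toy_score`, its coefficients, the three features, and the baseline vector are hypothetical and chosen only for illustration. Features outside a coalition are replaced by a single baseline value, which is a simplification; SHAP implementations for tree models typically average over background data instead.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features that are not in a coalition take their baseline value
    (a simplifying assumption made for this sketch).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(features):
    """Hypothetical susceptibility score; NOT the model fitted in the study."""
    elevation, river_distance, slope = features
    return (0.5 - 0.3 * elevation - 0.4 * river_distance
            + 0.2 * slope * (1.0 - river_distance))


# Hypothetical, normalized inputs: [elevation, distance to river, slope]
sample = [0.2, 0.1, 0.5]
reference = [0.6, 0.8, 0.3]
contributions = shapley_values(toy_score, sample, reference)
for name, value in zip(["elevation", "river_distance", "slope"], contributions):
    print(f"{name}: {value:+.4f}")
```

The contributions sum to the difference between the score of the sample and the score of the reference point, which is the additivity property that lets each factor's effect on a single prediction be read off directly.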
XGBoost; Deep Random Forest; SHapley Additive exPlanations; Three Gorges Reservoir Area; landslide susceptibility mapping
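The two evaluation steps named in the abstract, the AUC and the equal-interval classification, can be illustrated with a short sketch. It makes several assumptions: the labels and scores are invented, the number of susceptibility classes is set to five, and predicted probabilities are taken to lie in [0, 1]. The abstract does not state these details.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen landslide sample
    scores higher than a randomly chosen non-landslide sample
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def equal_interval_class(score, n_classes=5, low=0.0, high=1.0):
    """Map a susceptibility score to a class index 0..n_classes-1
    using intervals of equal width between low and high."""
    index = int((score - low) * n_classes / (high - low))
    # Clamp so that score == high falls in the top class
    return min(max(index, 0), n_classes - 1)


# Invented example data: 1 = landslide, 0 = non-landslide
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
classes = [equal_interval_class(s) for s in scores]
print(auc, classes)
```

With equal intervals the class boundaries depend only on the score range, not on how the scores are distributed, so the area assigned to each class can differ sharply from that produced by other classification methods.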