
A SHAP-based explainable machine learning model for landslide susceptibility evaluation

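Machine learning has seen limited uptake in landslide susceptibility modelling because training is complex and predictions are hard to interpret. This study combines SHAP (SHapley Additive exPlanations) with machine learning models to reveal how each influencing factor affects landslide development, improving model credibility and interpretability. With Zhongxian County in the Three Gorges Reservoir Area as the study area, landslide susceptibility models were built with Random Forest, XGBoost (eXtreme Gradient Boosting), and Deep Random Forest, each tuned by Bayesian optimization. Accuracy was validated with confusion matrices and receiver operating characteristic (ROC) curves; susceptibility zoning maps were produced with four classification methods; and SHAP was used to identify the dominant factors in landslide development. The optimized XGBoost model achieved an area under the ROC curve (AUC) of 0.817, higher than Random Forest (0.803) and Deep Random Forest (0.806). The zoning maps differ greatly across classification methods, with the equal-interval method combined with the XGBoost model performing comparatively better; very high and high susceptibility zones are concentrated in the southeast and northeast of the study area, especially along the banks of the Yangtze River and its tributaries. The SHAP plots show that different values of each dominant factor affect landslide development in clearly different ways, and that elevation and distance to rivers are the main influencing factors, contributing significantly to landslide development. The XGBoost model in this study has high prediction accuracy and strong interpretability, providing a scientific basis for targeted landslide prevention and control.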
Explainable machine learning models for landslide susceptibility mapping based on SHAP
In landslide susceptibility mapping, the adoption of machine learning has been restricted by its complex training and the difficulty of interpreting its predictions. This study aims to enhance model credibility and interpretability by combining SHapley Additive exPlanations (SHAP) with machine learning models to reveal the impact of evaluation factors on landslide development. Taking Zhongxian County in the Three Gorges Reservoir Area as the study area, landslide susceptibility models and maps are constructed using the Random Forest, eXtreme Gradient Boosting (XGBoost), and Deep Random Forest algorithms. The optimal parameters are selected using a Bayesian optimization algorithm. Evaluation accuracy is validated using confusion matrices and receiver operating characteristic (ROC) curves. Four different classification methods are used to produce the landslide susceptibility zoning. Finally, the SHAP algorithm is applied to interpret the influence of different factors on landslide development. The results show that the optimized XGBoost model achieves an area under the ROC curve (AUC) of 0.817, higher than the AUC values of the Random Forest (0.803) and Deep Random Forest (0.806) models. The susceptibility maps vary greatly across the classification methods, with the best results obtained from the equal-interval method combined with the XGBoost model. Based on this combination, the extremely high and high susceptibility zones are concentrated in the southeastern and northeastern parts of the study area, particularly along the banks of the Yangtze River and its tributaries. The SHAP analysis clarifies how different characteristic values of each dominant factor influence landslide development. Elevation and distance to rivers are the primary factors controlling landslide development in the study area. Overall, the XGBoost model used in this study exhibits high prediction accuracy and strong interpretability, providing a scientific basis for the precise prevention and control of landslide disasters.
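
The idea behind the SHAP attributions mentioned in the abstract can be sketched with exact Shapley values on a tiny example. This is a minimal illustration, not the authors' pipeline: the `toy_model` function, its coefficients, the three feature names, and the sample and baseline values are all hypothetical, and "missing" features are simply replaced by a single baseline value, which is a simplification of what the SHAP library does for tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature coalitions.

    Features outside a coalition are set to their baseline value
    (an assumption made here to keep the sketch small).
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Model output when only the features in `coalition` take the sample's values.
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical susceptibility score; coefficients are invented for illustration."""
    elevation_km, river_dist_km, slope_deg = z
    score = 0.5 - 0.3 * elevation_km - 0.2 * river_dist_km + 0.01 * slope_deg
    if river_dist_km < 0.5 and slope_deg > 20:
        score += 0.1  # invented interaction: steep slope close to a river
    return score


x = [0.3, 0.2, 25.0]         # hypothetical grid cell: low, near a river, steep
baseline = [0.6, 1.5, 15.0]  # hypothetical reference cell
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["elevation", "river distance", "slope"], phi):
    print(f"{name}: {contribution:+.3f}")
print(f"sum of contributions: {sum(phi):+.3f}")
print(f"f(x) - f(baseline):   {toy_model(x) - toy_model(baseline):+.3f}")
```

The last two printed lines agree, which is the additivity property that gives SHAP its name: the per-factor contributions sum to the difference between the prediction for the sample and the prediction for the baseline. Enumerating coalitions costs exponential time in the number of features, so for real models such as XGBoost a dedicated tree-based explainer would normally be used instead.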

XGBoost; Deep Random Forest; SHapley Additive exPlanations; Three Gorges Reservoir Area; landslide susceptibility mapping

CUI Tingting, AN Xuelian, SUN Deliang, CHEN Dongsheng, ZHU Youchen


School of Mathematics and Computer Science, Chongqing Institute of Foreign Studies, Chongqing 401520

Chongqing Key Laboratory of GIS Application Research, Chongqing Normal University, Chongqing 401331

XGBoost; deep random forest; SHAP; Three Gorges Reservoir Area; landslide susceptibility evaluation

2025

Journal of Chengdu University of Technology (Science & Technology Edition)
Chengdu University of Technology


Peking University Core Journal
Impact factor: 1.596
ISSN: 1671-9727
Year, volume (issue): 2025, 52(1)