A new training data sampling method for machine learning-based landslide susceptibility mapping

Training samples play an important role in machine learning-based regional landslide susceptibility assessment. They usually consist of landslide (positive) and non-landslide (negative) samples collected by a sampling method. However, existing positive-sample sampling methods do not measure the reliability of the positive samples they collect, so the reliability of the resulting training set cannot be guaranteed, which limits the performance of machine learning-based regional landslide susceptibility assessment. To address this problem, this paper proposes a landslide positive-sample prototype-based sampling method (PBS). The method uses the geographic environmental similarity and dissimilarity between a given point and the landslide positive-sample prototype to measure the reliability of positive and negative samples, respectively, and sets reliability thresholds based on a mutual exclusion method to collect training samples. Taking the Youfanggou Basin in Gansu Province as the study area, PBS and existing representative sampling methods were each used to build landslide susceptibility prediction models based on logistic regression, support vector machine, and random forest, and the assessment performance obtained with and without reliability-measured samples was compared. The results show that assessment performance rises with fluctuations as positive-sample reliability increases and is positively correlated with negative-sample reliability. Across the three machine learning models, PBS improved validation accuracy and the area under the receiver operating characteristic curve (AUC) by at least 14.7% and 14%, respectively, compared with the existing representative sampling methods, with small standard deviations for both metrics, indicating that the proposed method is effective.
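The sampling idea summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: the prototype is taken as the per-factor mean of known landslide points, similarity is a Gaussian kernel combined with a limiting-factor (minimum) rule, dissimilarity is `1 - similarity`, and mutual exclusion is enforced by requiring `t_pos + t_neg > 1`. The function names `prototype_similarity` and `sample_by_reliability` and the default thresholds are hypothetical, not taken from the paper.

```python
import numpy as np


def prototype_similarity(candidates, landslides):
    """Similarity in [0, 1] of each candidate point to a landslide prototype.

    Both arrays have shape (n_points, n_factors), where each column is an
    environmental factor (for example slope or elevation).
    """
    # Assumed prototype: per-factor mean of the known landslide points.
    proto = landslides.mean(axis=0)
    # Per-factor scale; the small constant avoids division by zero.
    spread = landslides.std(axis=0) + 1e-9
    # Gaussian kernel per factor (assumption, not the paper's formula).
    per_factor = np.exp(-0.5 * ((candidates - proto) / spread) ** 2)
    # Limiting-factor rule: overall similarity is the weakest factor.
    return per_factor.min(axis=1)


def sample_by_reliability(candidates, landslides, t_pos=0.8, t_neg=0.8):
    """Return indices of reliable positive and negative samples.

    Positive reliability is the similarity to the prototype; negative
    reliability is the dissimilarity (1 - similarity).
    """
    # With t_pos + t_neg > 1 no point can pass both thresholds, which is
    # one simple way to make the two sample sets mutually exclusive.
    if t_pos + t_neg <= 1.0:
        raise ValueError("t_pos + t_neg must exceed 1 for mutual exclusion")
    sim = prototype_similarity(candidates, landslides)
    pos_idx = np.flatnonzero(sim >= t_pos)
    neg_idx = np.flatnonzero(1.0 - sim >= t_neg)
    return pos_idx, neg_idx, sim


# Illustrative data only: columns are two made-up environmental factors.
known_landslides = np.array([[30.0, 100.0], [34.0, 120.0], [32.0, 110.0]])
grid_points = np.array([[32.0, 110.0], [5.0, 400.0], [33.0, 112.0]])
pos, neg, reliability = sample_by_reliability(grid_points, known_landslides)
```

In this sketch, points that pass neither threshold are simply left out of the training set, which is how a reliability threshold would trade sample count for sample quality.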

regional landslide susceptibility assessment; training sample sampling; machine learning models; reliability measurement

Hong Haoyuan, Wang Desheng, Zhu A-Xing

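School of Geographical Sciences, Nanjing University of Information Science and Technology, Nanjing 210044

School of Geography and Tourism, Zhengzhou Normal University, Zhengzhou 450044

School of Geography, Nanjing Normal University, Nanjing 210023

Department of Geography, University of Wisconsin-Madison, Madison 53706, USA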

National Natural Science Foundation of China

41871300

2024

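Acta Geographica Sinica
The Geographical Society of China; Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences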

CSTPCD, CSSCI, CHSSCD, Peking University Core Journal
Impact factor: 3.3
ISSN: 0375-5444
Year, volume (issue): 2024, 79(7)