Journal of Arid Land 2023, Vol. 15, Issue 2: 191-204.

Estimation of soil organic matter in the Ogan-Kuqa River Oasis, Northwest China, based on visible and near-infrared spectroscopy and machine learning

ZHOU Qian, DING Jianli, GE Xiangyu, LI Ke, ZHANG Zipeng, GU Yongsheng

Author information

  • 1. College of Geography and Remote Sensing Science, Xinjiang University, Urumqi 830046, China; Xinjiang Key Laboratory of Oasis Ecology, Xinjiang University, Urumqi 830046, China; Key Laboratory of Smart City and Environment Modelling of Higher Education Institute, Xinjiang University, Urumqi 830046, China

Abstract

Visible and near-infrared (vis-NIR) spectroscopy allows fast and efficient determination of soil organic matter (SOM). However, predicting SOM from vis-NIR spectra first requires the effective removal of redundant information. This study therefore evaluates three wavelength selection strategies for capturing the spectral response characteristics of SOM. The SOM content and spectral information of 110 soil samples from the Ogan-Kuqa River Oasis were measured under laboratory conditions in July 2017. Pearson correlation analysis was used to preselect, from the preprocessed spectra, the wavelengths that passed the significance test at the 0.01 level. The successive projection algorithm (SPA), competitive adaptive reweighted sampling (CARS), and the Boruta algorithm were then used to identify the optimal variables among the preselected wavelengths. Finally, partial least squares regression (PLSR) and random forest (RF) models were built on the optimal wavelengths to estimate SOM content quantitatively. The results show that the selected variables were mainly located near the spectral absorption features (i.e., 1400.0, 1900.0, and 2200.0 nm), and that CARS and the Boruta algorithm also selected a few visible wavelengths in the range of 480.0-510.0 nm. Both models predicted SOM content satisfactorily, and the RF model was more accurate than the PLSR model. The model combining the Boruta algorithm with RF performed best: using 23 variables, it achieved a coefficient of determination (R²) of 0.78 and a residual prediction deviation (RPD) of 2.38. The Boruta algorithm effectively removed redundant information and refined the wavelength selection, improving the accuracy of the estimated SOM content. Combining vis-NIR spectroscopy with machine learning is therefore an important way to improve the accuracy of SOM prediction in arid land.
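
For readers unfamiliar with the two accuracy metrics reported above, the following is a minimal sketch of how R² and RPD are commonly computed from observed and predicted values. It is not the authors' code. It assumes the usual definition of RPD as the standard deviation of the observed values divided by the root mean square error of prediction, and it assumes the sample (n-1) standard deviation; the abstract does not state which conventions the study used. The function names and the example numbers are hypothetical.

```python
import math
import statistics


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error of prediction."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Residual prediction deviation: SD(observed) / RMSE.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return statistics.stdev(observed) / rmse(observed, predicted)


if __name__ == "__main__":
    # Hypothetical SOM values (g/kg), for illustration only; not data from the study.
    som_observed = [8.2, 11.5, 14.1, 9.7, 17.3, 12.8]
    som_predicted = [9.0, 10.9, 13.2, 10.4, 16.1, 13.5]
    print(f"R2  = {r2_score(som_observed, som_predicted):.2f}")
    print(f"RPD = {rpd(som_observed, som_predicted):.2f}")
```

Under these definitions, a larger RPD means the prediction error is small relative to the natural spread of SOM in the samples, which is why it is reported alongside R².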

Key words

soil organic matter content/vis-NIR spectroscopy/random forest/Boruta algorithm/machine learning


Funding

Key Project of the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01D06)

National Natural Science Foundation of China (41961059)

Publication year

2023
Journal of Arid Land
Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences; Science Press

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 1.743
ISSN: 1674-6767
Number of references: 1