Prediction of potential geographic distribution of Oncomelania hupensis in Yunnan Province using random forest and maximum entropy models
Objective: To predict the potential geographic distribution of Oncomelania hupensis in Yunnan Province using random forest (RF) and maximum entropy (MaxEnt) models, so as to provide insights into O. hupensis surveillance and control in Yunnan Province.

Methods: O. hupensis snail survey data collected in Yunnan Province from 2015 to 2016 were converted into snail distribution site data. Data on 22 environmental variables in Yunnan Province were collected, including twelve climate variables (annual potential evapotranspiration, annual mean ground surface temperature, annual precipitation, annual mean air pressure, annual mean relative humidity, annual sunshine duration, annual mean air temperature, annual mean wind speed, ≥ 0 ℃ annual accumulated temperature, ≥ 10 ℃ annual accumulated temperature, aridity and index of moisture), eight geographical variables (normalized difference vegetation index, landform type, land use type, altitude, soil type, soil texture-clay content, soil texture-sand content and soil texture-silt content) and two population and economic variables (gross domestic product and population). Variables were screened with the Pearson correlation test and the variance inflation factor (VIF) test. The RF and MaxEnt models and their ensemble model were built using the biomod2 package in R 4.2.1, and the potential distribution of O. hupensis snails in Yunnan Province after 2016 was predicted. Model performance was evaluated through cross-validation and independent tests, using the area under the receiver operating characteristic curve (AUC), the true skill statistic (TSS) and the Kappa statistic. To assess the importance of environmental variables, the variable contributions output by models with AUC values of > 0.950 and TSS values of > 0.850 were normalized to obtain the importance percentage of each variable.

Results: Data from 148 O. hupensis snail distribution sites and 15 environmental variables were included in the training sets of the RF and MaxEnt models. Both models had high predictive performance, with mean AUC values of > 0.900 and mean TSS and Kappa values of > 0.800, and there were significant differences between the two models in AUC (t = 19.862, P < 0.05), TSS (t = 10.140, P < 0.05) and Kappa values (t = 10.237, P < 0.05). The AUC, TSS and Kappa values of the ensemble model were 0.996, 0.954 and 0.920, respectively. Independent data verification showed that the AUC, TSS and Kappa values of the RF model and the ensemble model were all 1, indicating that both retained high performance on unknown data, whereas the MaxEnt model performed poorly, with TSS and Kappa values of 0 in 24% (24/100) of the modeling results. The modeling results of 79 RF models, 38 MaxEnt models and their ensemble models with AUC values of > 0.950 and TSS values of > 0.850 were included in the evaluation of the importance of environmental variables. The importance of annual sunshine duration (SSD) was 32.989%, 37.847% and 46.315% in the RF model, the MaxEnt model and their ensemble model, respectively, while the importance of annual mean relative humidity (RHU) was 30.947%, 15.921% and 28.121%, respectively. Important environmental variables were concentrated in the modeling results of the RF model, dispersed in those of the MaxEnt model, and most concentrated in those of the ensemble model. The potential distribution of O. hupensis snails in Yunnan Province after 2016 was relatively concentrated as predicted by the RF model and relatively large as predicted by the MaxEnt model, and the distribution predicted by the ensemble model was mostly the joint distribution of the areas predicted by the RF and MaxEnt models.

Conclusions: Both RF and MaxEnt models are effective in predicting the potential distribution of O. hupensis snails in Yunnan Province, which facilitates targeted O. hupensis snail control.
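For readers less familiar with the threshold-based metrics used above, the following minimal Python sketch shows how TSS and Kappa are conventionally computed from a presence/absence confusion matrix. It is not the biomod2 implementation used in the study; the function names and the toy observation vectors are illustrative assumptions.

```python
def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for binary presence (1) / absence (0) data."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1
        elif pred:
            fp += 1
        elif obs:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def tss(tp, fp, tn, fn):
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + tn + fn
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    if p_expected == 1:
        return 0.0  # degenerate case: only one class present in both vectors
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data (invented for illustration): 4 presence sites, 6 absence sites.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
perfect = list(observed)        # every site classified correctly
all_presence = [1] * 10         # model predicts presence everywhere

perfect_scores = (tss(*confusion_counts(observed, perfect)),
                  cohen_kappa(*confusion_counts(observed, perfect)))
degenerate_scores = (tss(*confusion_counts(observed, all_presence)),
                     cohen_kappa(*confusion_counts(observed, all_presence)))
```

With these toy vectors, a perfect classification gives TSS and Kappa of 1, while a model that predicts presence at every site gives 0 for both. This is one situation in which zero values can arise; the abstract does not state why some MaxEnt runs scored 0.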
Keywords: Oncomelania hupensis; Random forest model; Maximum entropy model; Geographical distribution; Predictive performance; Yunnan Province
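The importance percentages reported in the abstract come from normalizing the variable contributions of models that passed the AUC > 0.950 and TSS > 0.850 filter. The exact procedure is not given in the abstract, so the Python sketch below shows only one plausible reading: average the raw contributions over the retained runs, then rescale them to sum to 100%. The function name, the record layout, the "ALT" variable code and all numeric values are hypothetical.

```python
def importance_percentages(runs, auc_min=0.950, tss_min=0.850):
    """Average variable contributions over well-performing runs and rescale to percentages.

    Each run is a dict with keys "auc", "tss" and "importance"
    (a mapping from variable name to raw contribution). This layout is assumed.
    """
    # Keep only runs above both performance thresholds, as described in the Methods.
    selected = [r for r in runs if r["auc"] > auc_min and r["tss"] > tss_min]
    if not selected:
        raise ValueError("no run passes the AUC/TSS thresholds")

    variables = list(selected[0]["importance"])
    # Mean raw contribution of each variable across the retained runs.
    mean_importance = {
        v: sum(r["importance"][v] for r in selected) / len(selected)
        for v in variables
    }
    # Normalize so that the percentages sum to 100.
    total = sum(mean_importance.values())
    return {v: 100 * m / total for v, m in mean_importance.items()}


# Invented example: three runs, of which the third fails both thresholds.
example_runs = [
    {"auc": 0.98, "tss": 0.90, "importance": {"SSD": 0.6, "RHU": 0.3, "ALT": 0.1}},
    {"auc": 0.96, "tss": 0.88, "importance": {"SSD": 0.4, "RHU": 0.4, "ALT": 0.2}},
    {"auc": 0.93, "tss": 0.80, "importance": {"SSD": 0.1, "RHU": 0.1, "ALT": 0.8}},
]
percentages = importance_percentages(example_runs)
```

In this invented example only the first two runs are retained, so the excluded run's large "ALT" contribution does not affect the result. Other normalization schemes, such as normalizing within each run before averaging, would also be consistent with the description in the abstract.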