Stacking Ensemble Learning Modeling and Forecasting of Maize Yield Based on Meteorological Factors
[Objective] Against a background of intensifying global climate change and frequent meteorological disasters, understanding how meteorological factors affect maize yield and predicting that yield accurately are crucial for improving agricultural production and field management. This paper aims to quantify the importance of meteorological factors at each maize growth stage for yield, and to establish an accurate and reliable stacking ensemble learning model for estimating maize meteorological yield.
[Method] Trend yield models for each county were determined using the HP filter method and the moving average method, and county-level meteorological yields were then isolated. Using daily meteorological data and maize yield data spanning 34 years from 596 county-level administrative regions and meteorological observation stations across 12 provinces in China, three maize meteorological yield prediction models were established, one for each of three ensemble learning frameworks: light gradient boosting machine (LightGBM), Bagging, and Stacking.
[Result] As a trend yield model, the HP filter method was mainly applicable in Shaanxi, Henan, Jiangsu, and Anhui. The moving average method suited more counties than the HP filter method, with R² above 0.8 in most counties. Based on a 5-year sliding forecast and the model accuracy evaluation indicators, the mean absolute percentage error (MAPE) of all three models for maize yield was below 6%. The Stacking model achieved a MAPE of 4.60%, indicating high prediction accuracy and strong generalizability. The results demonstrate that the stacking ensemble learning prediction model for maize meteorological yield has higher accuracy and stronger robustness. It effectively exploits the characteristics and advantages of each base learner to improve prediction accuracy, making it the optimal model for predicting maize yield based on meteorological factors. Furthermore, a quantitative analysis of 27 meteorological factors during the maize growth stages in the 12 provinces, based on random forest feature importance scores, provides a reference for crop monitoring and field management.
[Conclusion] The three ensemble learning methods, especially the stacking ensemble learning model (Stacking), can accurately reflect the spatiotemporal distribution of changes in maize yield. The stacking ensemble learning model for maize yield based on meteorological factors provides a new method for field management and accurate prediction of maize yield.
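The yield separation step and the MAPE metric mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an additive decomposition (actual yield = trend yield + meteorological yield), covers only the moving average variant and not the HP filter, and uses a hypothetical window length and a made-up county series; the function names are illustrative.

```python
from statistics import mean


def moving_average_trend(yields, window=3):
    """Centered moving average; the window shrinks at the series edges.

    The window length is a hypothetical choice, not taken from the paper.
    """
    half = window // 2
    trend = []
    for i in range(len(yields)):
        lo = max(0, i - half)
        hi = min(len(yields), i + half + 1)
        trend.append(mean(yields[lo:hi]))
    return trend


def meteorological_yield(yields, trend):
    """Assumed additive decomposition: actual = trend + meteorological."""
    return [y - t for y, t in zip(yields, trend)]


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


# Hypothetical county yield series (t/ha), invented for illustration only.
yields = [4.0, 4.4, 4.2, 4.9, 5.1, 4.8, 5.5]
trend = moving_average_trend(yields, window=3)
weather_part = meteorological_yield(yields, trend)
```

In a workflow like the one described, the meteorological component (here `weather_part`) would serve as the regression target for the ensemble models, and `mape` would be computed on the reconstructed yield for each forecast year.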