Rapid correction of near real-time FY-4A retrieval based on ensemble machine learning
Satellite remote sensing retrieval is an important way to obtain near-real-time, high-resolution precipitation information. Fengyun-4A (FY-4A) is outfitted with the Advanced Geosynchronous Radiation Imager (AGRI), which boasts world-leading performance. The dual scanning mirrors of AGRI enable precise 2-D pointing, allowing for minute-level regional scans, a groundbreaking achievement. This advanced instrument can capture high-frequency images of the Earth's cloud cover in more than 14 spectral bands. It can generate the official FY-4A REGC (Regional Precipitation Estimation Near-real-time Product for China), which is one of the precipitation estimates that China can obtain independently from satellite remote sensing retrieval. However, the accuracy of FY-4A REGC still lags behind that of IMERG-Early, the counterpart product of the Global Precipitation Measurement (GPM) mission. Currently, the prevailing approach for correcting satellite-derived precipitation products is to construct linear prior relationship models between historical satellite rainfall estimates and the corresponding ground truth measurements, typically obtained from rain gauges or radar systems. When new observational data become available, this relationship is used to derive corrected precipitation values. However, linear models struggle to capture the intricate relationship between satellite rainfall estimates and ground truth measurements. Ensemble learning methods offer nonlinear models with advantages such as faster training, reduced data requirements, and robust model stability. In this study, a correction method for the official FY-4A precipitation estimates is dynamically constructed for the mainland China region using an ensemble machine learning method (LightGBM), with FY-4A REGC as the model input and IMERG-Early as the calibration target for training. The revised FY-4A precipitation product (FY-4A Adj) was compared with the original FY-4A REGC using the CMPA automatic gauge observations as the ground reference. The Correlation Coefficient (CC), Root Mean Square Error (RMSE), and relative bias (Bias) of FY-4A Adj were found to be significantly improved compared with those of FY-4A REGC. The revised algorithm effectively reduced the significant overestimation of the original FY-4A REGC in southern China. Our investigation revealed that choosing an appropriate order for the training information significantly enhances model accuracy; this study adopts training order 221. In practical applications, the ensemble learning model can continually optimize its parameters and performance by dynamically adjusting to the latest training data in real time. We also conducted a comparative analysis of two classes of ensemble learning methods, namely bagging and boosting. Our findings indicate that the Random Forest method performs better when working with limited data volumes, while LightGBM is the recommended choice for large datasets. In conclusion, the correction method based on ensemble machine learning proposed in this paper can quickly and effectively improve the near-real-time precipitation estimates of FY-4A REGC. This method provides guidance for producing high-quality satellite precipitation products based on FY-4A.
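The three evaluation statistics named in the abstract (CC, RMSE, and relative bias) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the paper's evaluation code: the function names are hypothetical, the relative bias is assumed to be the total error divided by the total reference rainfall (the abstract does not give the formula), and the sample values are made up purely to exercise the functions.

```python
import math


def correlation_coefficient(est, ref):
    """Pearson correlation between estimated and reference rainfall."""
    n = len(est)
    mean_e = sum(est) / n
    mean_r = sum(ref) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(est, ref))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_e * var_r)


def rmse(est, ref):
    """Root mean square error of the estimates against the reference."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(est))


def relative_bias(est, ref):
    """Assumed definition: sum(est - ref) / sum(ref); positive means overestimation."""
    return sum(e - r for e, r in zip(est, ref)) / sum(ref)


# Made-up illustrative values (mm/h), NOT data from the study.
gauge = [0.0, 1.0, 2.0, 4.0]      # stand-in for the ground reference
satellite = [0.0, 2.0, 4.0, 8.0]  # stand-in for an overestimating product

if __name__ == "__main__":
    print("CC   =", correlation_coefficient(satellite, gauge))
    print("RMSE =", rmse(satellite, gauge))
    print("Bias =", relative_bias(satellite, gauge))
```

In this toy case the satellite series is exactly twice the gauge series, so the correlation is perfect while the RMSE and relative bias are large. This shows why the three statistics are reported together: CC alone would not reveal a systematic overestimation.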