Lightning Nowcasting Method for Shannan City, Tibet, Based on FY-4A Satellite Data and the Random Forest Algorithm
Lightning nowcasting on the Tibetan Plateau is hampered by the lack of radar data. To address this problem, FY-4A satellite data, convection indices from the ERA5 reanalysis, and lightning location information are used to construct 18 prediction factors that follow the mechanisms of lightning formation and development. A lightning nowcasting model for Tibet's Shannan region is then established with the random forest algorithm. The probability density distribution of each prediction factor is analysed statistically for the lightning and non-lightning samples and compared with the feature importance given by the random forest model. The statistical results agree well with the feature-importance ranking, which indicates that the proposed prediction factors have a relatively clear physical meaning and that the established model is highly reliable. The results also show that the difference between the infrared brightness temperature and the land surface temperature, the lightning location data of the past 10 minutes, the K-index, and the infrared brightness temperatures of channels 11 and 12 contribute significantly to the lightning nowcasting model. Two cases are used to analyse the predictive ability of the random forest model at different stages of lightning development. The results show that the model can effectively predict the lightning location for the next 30 minutes, and that the predicted lightning locations are consistent with the observations, especially during the stage of strong convective development. During the early stage of convective development and during dissipation, however, the model is limited in its ability to predict the evolution of convection, so its false alarm ratio (FAR) and missed alarm ratio (MAR) are relatively high and its predictions are relatively poor. To find the best forecast lead time, separate random forest nowcasting models were trained for the next 10, 20 and 30 minutes. The validation results show that as the lead time increases, the FAR of the random forest model gradually decreases and the MAR gradually increases. As a result, the model for the next 20 minutes has the highest critical success index (CSI) and the best overall performance. To test the models further, the traditional optical flow extrapolation method was selected for comparison. The random forest models outperform the optical flow extrapolation method at all three lead times, with a higher probability of detection (POD), a higher CSI, and a relatively lower FAR. The CSI of the random forest model exceeds 0.70.
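The verification scores named above (POD, FAR, MAR and CSI) can be illustrated with a minimal sketch. It assumes the conventional contingency-table definitions, which the abstract itself does not spell out: POD = hits / (hits + misses), FAR = false alarms / (hits + false alarms), MAR = misses / (hits + misses), and CSI = hits / (hits + misses + false alarms). The function name `contingency_scores`, the `Scores` container, and the per-grid-cell 0/1 input format are hypothetical choices made for illustration only; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    pod: float  # probability of detection
    far: float  # false alarm ratio
    mar: float  # missed alarm ratio (assumed here to equal 1 - POD)
    csi: float  # critical success index


def contingency_scores(observed, predicted):
    """Score a binary lightning forecast against observations.

    Both arguments are equal-length sequences of 0/1 flags, one per
    grid cell (hypothetical input format): 1 means lightning.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    pairs = list(zip(observed, predicted))
    hits = sum(1 for o, p in pairs if o and p)
    misses = sum(1 for o, p in pairs if o and not p)
    false_alarms = sum(1 for o, p in pairs if p and not o)

    def ratio(numerator, denominator):
        # Return NaN when a score is undefined (empty denominator).
        return numerator / denominator if denominator else float("nan")

    return Scores(
        pod=ratio(hits, hits + misses),
        far=ratio(false_alarms, hits + false_alarms),
        mar=ratio(misses, hits + misses),
        csi=ratio(hits, hits + misses + false_alarms),
    )


if __name__ == "__main__":
    # Toy data, not results from the study.
    obs = [1, 1, 1, 0, 0, 1, 0, 0]
    fcst = [1, 1, 0, 1, 0, 1, 0, 0]
    print(contingency_scores(obs, fcst))
```

Because CSI penalises both misses and false alarms, a lead time at which the FAR has fallen but the MAR has not yet risen much can give the highest CSI, which is consistent with the trade-off described for the 20-minute model.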
lightning nowcasting; random forest; FY-4A satellite; convective index