
Comparison and Optimization of Ground-Level NO2 Concentration Estimation in China Based on TROPOMI and OMI

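Because nitrogen dioxide (NO2) has a short atmospheric lifetime, the tropospheric NO2 column concentration retrieved by satellite remote sensing is closely related to the ground-level NO2 concentration. The TROPOspheric Monitoring Instrument (TROPOMI) onboard the European Space Agency (ESA) Sentinel-5P (S5P) satellite provides tropospheric NO2 data at the highest spatial resolution currently available, and its potential advantages for estimating ground-level NO2 concentration urgently need to be tested. To this end, this paper uses the extreme gradient boosting (XGBoost) algorithm and four years (2018-2021) of TROPOMI and Ozone Monitoring Instrument (OMI) data to estimate ground-level NO2 concentration in China and carries out a comparative analysis. The results show that: 1) the TROPOMI-based estimates are clearly better than the OMI-based estimates in both accuracy and spatial coverage; 2) limited by its own spatial resolution, OMI data cannot resolve the spatial distribution details near areas of high NO2 concentration as TROPOMI does, which leads to more severe overestimation or underestimation in its estimates. Furthermore, to address the underestimation of high values that is common when machine learning methods estimate ground-level NO2, an ensemble model is used for optimization and yields better results (R2 = 0.85, slope = 0.89). The findings help promote the in-depth application of satellite remote sensing in ground-level NO2 concentration estimation and exposure assessment.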
Comparison and Optimization of Ground-Level NO2 Concentration Estimation in China Based on TROPOMI and OMI
Objective Nitrogen dioxide (NO2) in the atmosphere has an important impact on air quality and climate change, and ground-level NO2 directly affects human health. China is one of the regions with the highest NO2 concentrations in the world. Long-term surface NO2 concentration data have been provided by the China Environmental Monitoring Station since 2013, and satellite data can compensate for the limited spatial coverage of ground stations. Compared with the earlier Ozone Monitoring Instrument (OMI), the TROPOspheric Monitoring Instrument (TROPOMI) offers higher data coverage and spatial resolution, but its potential for ground-level NO2 estimation remains to be verified, and the tendency of estimation models to underestimate high-value samples needs to be corrected. This paper uses machine learning algorithms to estimate ground-level NO2 concentration in China from satellite observations and to produce 0.05-degree NO2 concentration raster data from 2014 to 2021. On this basis, the estimation results from TROPOMI and OMI observations are systematically compared, and an optimized model is established to reduce the underestimation of conventional machine learning models in high-value areas.

Methods The dataset contains ground-level NO2 concentration observations from ground stations; tropospheric NO2 column concentrations from OMI and TROPOMI, obtained from the European Space Agency and Google Earth Engine; and auxiliary data comprising ERA5 meteorological data, population data, surface elevation data, and land use data. Preprocessing assigns station data to the nearest grid cell and resamples data of different spatial resolutions to 0.05 degrees. The model is built with the XGBoost algorithm, which builds on gradient-boosted decision trees (GBDT) and is optimized for higher prediction accuracy. Model features are selected using the variance inflation factor (VIF) and analyzed using Shapley additive explanations (SHAP) values. The difference between TROPOMI and OMI in estimating ground-level NO2 concentration is studied by comparing the temporal and spatial coverage of the two datasets and by comparing satellite imagery and estimation results for a specific area. In addition, the estimation model is optimized through an ensemble model that contains a classification model and a high-value prediction model.

Results and Discussions The uneven spatial distribution of ground stations causes the estimates to take the same value across areas with few stations, so estimation accuracy there is poor (Fig. 2). The VIF of features related to geographic information is far above the threshold of 10, and the VIFs of surface pressure and DSM also exceed the threshold (Fig. 3). After comparing the correlation coefficients of these two features with the surface observations, as well as their update frequencies, we remove the surface elevation and retain the surface pressure. The SHAP feature importance of the OMI data is 6.09, much higher than that of the other features (Fig. 3). The SHAP beeswarm plot shows that higher OMI observations increase the predicted value and lower OMI observations decrease it (Fig. 3). The temporal and spatial resolution of TROPOMI data is higher than that of OMI (Fig. 4), and the accuracy metrics of its estimation results are better than those of OMI (Fig. 5). Comparing the satellite observations and estimates for specific regions with ground-based observations shows that the higher-resolution TROPOMI data capture spatial gradients that OMI data fail to resolve, resulting in more accurate estimates (Fig. 6). By first classifying high-value samples and then building an additional estimation model for them, the optimized model raises the slope of the scatter-plot fit from 0.79 to 0.89 and R2 from 0.79 to 0.85 (Fig. 7). The maps also show that the estimates of the optimized model are closer to the ground observations (Fig. 8).

Conclusions 1) The latitude and longitude variables in the prediction model exhibit serious multicollinearity, which degrades the quality of model estimation; 2) TROPOMI has higher data coverage than OMI and yields better estimates in ten-fold cross-validation (R2: 0.79 vs. 0.75; slope: 0.79 vs. 0.74); 3) the high spatial resolution of TROPOMI can identify areas of high or low near-surface NO2 that OMI cannot; 4) establishing an ensemble model that processes high-value samples separately significantly improves prediction accuracy: R2 increases from 0.79 to 0.85, and the slope of the fitted line increases from 0.79 to 0.89.
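The ensemble idea in the abstract (flag high-value samples with a classifier, then estimate them with a dedicated high-value model) can be illustrated with a toy sketch. This is not the authors' implementation: the paper uses XGBoost on satellite and auxiliary features, whereas the sketch below swaps in a one-feature least-squares line for each regressor and a single-cutoff classifier, and runs on synthetic data. All function names, the threshold, and the data are assumptions made for illustration only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in for a trained regressor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_stump(xs, labels):
    """Choose the cutoff on x that best separates high from non-high samples
    (hypothetical stand-in for the classification model)."""
    best_cut, best_hits = None, -1
    for cut in sorted(set(xs)):
        hits = sum((x >= cut) == lab for x, lab in zip(xs, labels))
        if hits > best_hits:
            best_cut, best_hits = cut, hits
    return best_cut


def fit_two_stage(xs, ys, high_threshold):
    """Fit a base model on all samples, a second model on high-value samples only,
    and a classifier that decides which model handles a new sample."""
    labels = [y >= high_threshold for y in ys]
    high_x = [x for x, lab in zip(xs, labels) if lab]
    high_y = [y for y, lab in zip(ys, labels) if lab]
    return {
        "base": fit_line(xs, ys),
        "high": fit_line(high_x, high_y),
        "cut": fit_stump(xs, labels),
    }


def predict(model, x):
    """Route the sample to the high-value model if the classifier flags it."""
    a, b = model["high"] if x >= model["cut"] else model["base"]
    return a * x + b


# Synthetic data (assumed): the response rises faster above x = 10,
# so a single line fitted to all samples underestimates the highest values.
xs = [float(i) for i in range(21)]
ys = [x if x <= 10 else 3 * x - 20 for x in xs]

model = fit_two_stage(xs, ys, high_threshold=10.0)
base_a, base_b = model["base"]
base_at_20 = base_a * 20.0 + base_b      # single-model estimate at the top of the range
two_stage_at_20 = predict(model, 20.0)   # estimate after routing to the high-value model
print(base_at_20, two_stage_at_20)
```

On this synthetic data the single line gives roughly 35.2 at x = 20, where the true value is 40, while the routed prediction recovers 40. That mirrors, in miniature, the high-value underestimation the abstract describes; the improvement reported in the paper itself (slope 0.79 to 0.89) comes from the authors' own models and data, not from this sketch.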

remote sensing and sensors; estimation of ground-level NO2 concentration; extreme gradient boosting algorithm; feature analysis; optimization of estimation

周文远、秦凯、何秦、王璐瑶、罗锦洪、谢卧龙


School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China

Xi'an Institute for Innovative Earth Environment Research, Xi'an 710061, Shaanxi, China

Shanxi Academy of Eco-Environmental Planning and Technology, Taiyuan 030000, Shanxi, China

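remote sensing and sensors; ground-level nitrogen dioxide concentration estimation; extreme gradient boosting algorithm; feature analysis; estimation optimization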

National Natural Science Foundation of China

42375125

2024

Acta Optica Sinica
Chinese Optical Society; Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.931
ISSN: 0253-2239
Year, volume (issue): 2024, 44(6)