Machine Learning-based Prediction Model for Atmospheric NO2 Concentration
Traditional NO2 monitoring techniques face challenges such as delayed response times. Predicting atmospheric NO2 levels is crucial for informing environmental policy decisions and improving air quality. Atmospheric NO2 levels are affected by various factors, including regional meteorological conditions, industrial pollution emissions, and socio-economic development, leading to notable regional disparities in NO2 pollution. In recent years, machine learning techniques have been widely used to predict pollutant levels, with the XGBoost (eXtreme Gradient Boosting) algorithm standing out for its ability to capture relationships in data. This study gathered annual data on atmospheric NO2 levels, meteorological conditions, industrial emissions, and socio-economic factors for 11 districts in Dalian City from 2011 to 2022. By combining a time-sliding strategy with the XGBoost algorithm, a spatially heterogeneous model was developed to predict NO2 concentrations. The coefficient of determination (R2) of the model's predictions reached 0.611, indicating that the model has good prediction performance and generalization ability. Multiple factors of concern were analyzed using SHAP (SHapley Additive exPlanations), and the results revealed that ammonia nitrogen emissions, retail sales of social consumer goods, and nitrogen oxide emissions were positively associated with the NO2 concentration.
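The abstract does not spell out the time-sliding strategy, so the following is only a minimal sketch of one common reading: a fixed-width window of consecutive years is used for training, the following year is held out for testing, and the window then slides forward by one year. The function names `sliding_year_splits` and `r_squared` and the window width of 5 years are illustrative assumptions, not details taken from the study. The model itself is left out; any regressor, such as XGBoost, could be fitted on each training window.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_year_splits(
    years: Sequence[int], train_span: int
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_years, test_year) pairs for a fixed-width sliding window.

    Hypothetical helper: the window width and one-year step are assumptions.
    """
    ordered = sorted(set(years))
    for start in range(len(ordered) - train_span):
        train_years = ordered[start:start + train_span]
        test_year = ordered[start + train_span]
        yield train_years, test_year


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# The study period 2011 to 2022, with an assumed 5-year training window.
splits = list(sliding_year_splits(range(2011, 2023), train_span=5))
for train_years, test_year in splits:
    print(f"train {train_years[0]}-{train_years[-1]} -> test {test_year}")
```

Under these assumptions, each split trains only on years that precede the test year, so the evaluation never uses future observations to predict the past. The R2 values from the individual test years could then be pooled or averaged; how the study aggregated them is not stated in the abstract.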