Air quality data restoration based on graph-regularized multi-view functional matrix completion

Owing to sensor malfunctions, data transmission failures, and similar problems, collected air quality data are often sparse and incomplete. To repair and reconstruct the missing parts of such data, a graph-regularized multi-view functional matrix completion method (GRMFMC) is proposed. First, the method introduces a graph regularization strategy that accounts for the high-order neighborhood relationships among samples within each pollutant, reducing information loss. Second, it uses the Hilbert-Schmidt Independence Criterion (HSIC) to extract complementary information across pollutants, thereby improving imputation accuracy. In addition, drawing on functional data analysis, it treats air quality observations at different time points as continuous functions and exploits their inherent smoothness and correlation for high-precision interpolation. Simulated imputation and empirical application on real air quality datasets show that, compared with other typical imputation methods, GRMFMC reduces the imputation error by 56% to 99% in RMSE and 46% to 98% in NRMSE in the simulations, and by 51% to 99% in RMSE and 40% to 98% in NRMSE in the empirical application. The method also remains robust across different missing rates and pollutant types, indicating good generalization ability and practical value.
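
The abstract names three generic building blocks: a graph regularization term, HSIC as a dependence measure between views, and RMSE/NRMSE as error metrics. The Python sketch below illustrates textbook versions of each and is not the authors' GRMFMC implementation. Several choices are assumptions made only for illustration: the RBF kernel and its `gamma`, the first-order symmetric k-nearest-neighbor graph (the paper refers to high-order neighborhoods, which this does not model), the biased empirical HSIC estimator, and NRMSE defined as RMSE divided by the range of the held-out true values. All function names are hypothetical.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.maximum(d2, 0.0)  # clip tiny negatives from round-off


def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix (assumed kernel choice, for illustration)."""
    return np.exp(-gamma * pairwise_sq_dists(X))


def hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def knn_laplacian(X, k=3):
    """Unnormalized Laplacian D - W of a symmetric kNN graph (first-order only)."""
    n = X.shape[0]
    d2 = pairwise_sq_dists(X)
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize
    return np.diag(W.sum(axis=1)) - W


def graph_penalty(U, lap):
    """trace(U^T L U) = 0.5 * sum_ij W_ij * ||u_i - u_j||^2."""
    return np.trace(U.T @ lap @ U)


def rmse(true, pred, mask):
    """RMSE over the entries selected by the boolean mask (held-out cells)."""
    diff = (true - pred)[mask]
    return np.sqrt(np.mean(diff ** 2))


def nrmse(true, pred, mask):
    """RMSE divided by the range of the held-out true values (assumed convention)."""
    held_out = true[mask]
    return rmse(true, pred, mask) / (held_out.max() - held_out.min())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(10, 4))                  # synthetic "pollutant" view
    view_b = view_a + 0.1 * rng.normal(size=(10, 4))   # correlated second view
    print("HSIC between views:", hsic(rbf_kernel(view_a), rbf_kernel(view_b)))
    print("Graph penalty:", graph_penalty(view_b, knn_laplacian(view_a)))
```

A small graph penalty means that samples connected in the neighbor graph have similar representations, which is the smoothness that graph regularization encourages. HSIC is zero in the population only when the two views are independent (for characteristic kernels), which is why it is a common tool for quantifying shared versus complementary information across views.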

Keywords: functional data analysis; multi-view learning; graph regularization; air quality data; matrix completion; data restoration

Gao Haiyan (高海燕), Ma Wenjuan (马文娟)

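School of Statistics and Data Science, Lanzhou University of Finance and Economics, Lanzhou 730020, Gansu, China

Key Laboratory of Digital Economy and Social Computing Science of Gansu Province, Lanzhou 730020, Gansu, China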

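Funding: National Social Science Fund of China (19XTJ002); Natural Science Foundation of Gansu Province (23JRRA1186); Lanzhou University of Finance and Economics Research Project (Lzufe2023C-005)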

2024

Journal: China Environmental Science (中国环境科学)
Sponsor: Chinese Society for Environmental Sciences (中国环境科学学会)

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 2.174
ISSN: 1000-6923
Year, volume (issue): 2024, 44(10)