Multi-source Heterogeneous Data Fusion Technologies and Government Big Data Governance System

With the rapid development of information technology, the volume of data held by governments at all levels and by large enterprises is growing exponentially. However, diverse data sources lead to inconsistent formats, uneven data quality degrades application results, decentralized data management weakens association and aggregation, and heterogeneous data modalities create semantic gaps. Against this background, multi-source heterogeneous data fusion effectively integrates multi-modal data from different sources, making the data complementary and associated and thereby achieving information enhancement. At present, most studies focus on big data governance processes and multi-modal deep learning; few discuss a complete technical framework for multi-source heterogeneous data fusion. Therefore, on the basis of a review of the key technologies, this paper proposes a key technology framework for multi-source heterogeneous data fusion that covers the whole process of "data collection-data cleaning-data integration-data fusion", and introduces the problems to be solved and the main tasks at each stage. Then, through a government affairs application scenario, it presents the design of a government big data governance system that addresses the wide range of sources, uneven quality, decentralized management, and heterogeneous forms of government data, and further explains the value of multi-source heterogeneous data fusion. Finally, the paper is summarized and future work is discussed.

Multi-source heterogeneous data; Multi-modal data fusion; Data governance technology; Big data of government affairs; Big data governance process

Yan Jiahe (闫佳和), Li Honghui (李红辉), Ma Ying (马英), Liu Zhen (刘真), Zhang Dalin (张大林), Jiang Zhouxian (江周娴), Duan Yuhang (段宇航)

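School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044

State Information Center, Beijing 100045

School of Software Engineering, Beijing Jiaotong University, Beijing 100044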

National Key Research and Development Program of China

2019YFB2102500

2024

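Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)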

CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(2)
  • 81