Research on a Data Governance System Based on Transportation Big Data
Huang Xueping 1, Su Yiting 2, Zhang Yongliang 1
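Author information
- 1. China Information Consulting & Designing Institute Co., Ltd., Nanjing, Jiangsu 210005
- 2. Shandong Agricultural University, Tai'an, Shandong 271018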
Abstract
The convergence of big data technology and the transportation industry has generated large volumes of transportation data. To handle this data, a data governance architecture based on transportation big data is proposed. The architecture comprises data collection, data storage, and data governance modules. Data collection uses collection tools to extract data from transportation business systems and message queues to ingest real-time data such as bus GPS. Data is stored in the HDFS distributed file system, and a data warehouse is built with Hive. The paper further analyzes methods for metadata management, master data management, data sharing and exchange, and data quality management in transportation big data governance, adopts Kerberos for security management, and proposes using MD5 to construct the CRC check value, together with a graph generation algorithm, to produce a data lineage graph for full life-cycle management of the data. The results of big data governance at the X Transportation Bureau show that data governance can effectively raise the bureau's level of informatization.
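The abstract does not spell out how the MD5 check value and the lineage graph fit together, so the following is only a minimal sketch of one possible reading, not the authors' algorithm. It assumes each dataset is fingerprinted with an MD5 digest and that lineage is stored as a directed graph whose edges point from a source dataset to the datasets derived from it. The names `fingerprint`, `LineageGraph`, `register`, `downstream`, and `is_unchanged`, as well as the sample dataset names, are hypothetical.

```python
import hashlib
from collections import defaultdict, deque


def fingerprint(rows):
    """Return an MD5 hex digest used as a checksum of a dataset's contents."""
    digest = hashlib.md5()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


class LineageGraph:
    """Directed graph: an edge source -> target means target was derived from source."""

    def __init__(self):
        self.checksums = {}               # dataset name -> MD5 checksum
        self.children = defaultdict(set)  # dataset name -> directly derived datasets

    def register(self, name, rows, derived_from=()):
        """Record a dataset's checksum and link it to the datasets it came from."""
        self.checksums[name] = fingerprint(rows)
        for parent in derived_from:
            self.children[parent].add(name)

    def downstream(self, name):
        """Breadth-first search for every dataset derived, directly or not, from `name`."""
        seen, queue = set(), deque([name])
        while queue:
            for child in sorted(self.children[queue.popleft()]):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        return sorted(seen)

    def is_unchanged(self, name, rows):
        """Compare the stored checksum with a fresh one to detect content changes."""
        return self.checksums[name] == fingerprint(rows)


# Hypothetical sample data: one bus GPS record flowing through two processing steps.
graph = LineageGraph()
gps = [("bus_12", 118.78, 32.04)]
graph.register("raw_gps", gps)
graph.register("cleaned_gps", gps, derived_from=["raw_gps"])
graph.register("route_stats", [("bus_12", 1)], derived_from=["cleaned_gps"])
print(graph.downstream("raw_gps"))
```

In this sketch, a changed checksum on an upstream dataset could be combined with `downstream` to list the derived datasets that may need to be refreshed, which is one way a lineage graph can support life-cycle management.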
Keywords
Big data / Data governance / Data sharing and exchange / Data full life-cycle management
Publication year
2024