Research on a data governance system based on transportation big data
The integration of big data technology with the transportation industry has generated a large amount of transportation data. To handle this massive data, a data governance architecture based on transportation big data is proposed. The architecture includes data collection, data storage and data governance modules. Data collection uses data collection tools to gather data from traffic business systems, and message queues to collect real-time data such as bus GPS. Data storage uses the HDFS distributed file system for storage and Hive to build the data warehouse. The methods of metadata management, master data management, data sharing and exchange, and data quality management in transportation big data governance are then analyzed. Kerberos is adopted for security management, and an algorithm that uses MD5 to construct check codes (CRC), combined with a graph generation algorithm, is proposed to generate a data lineage (pedigree) graph for full life cycle management of the data. The results of applying big data governance in the X Transportation Bureau show that data governance can effectively improve the bureau's informatization level.
Big data; Data governance; Data sharing and exchange; Data life cycle management
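The abstract does not spell out the lineage algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: each dataset is identified by an MD5 check code computed over its records, and a directed graph records which datasets each one was derived from. The names `md5_fingerprint`, `LineageGraph`, `register` and `upstream`, as well as the sample dataset names and records, are hypothetical and are not taken from the paper.

```python
import hashlib
from collections import defaultdict, deque


def md5_fingerprint(rows):
    """Return an MD5 hex digest used as a check code for a list of string records."""
    digest = hashlib.md5()
    for row in rows:
        digest.update(row.encode("utf-8"))
        digest.update(b"\n")  # record separator, so ["ab", "c"] differs from ["a", "bc"]
    return digest.hexdigest()


class LineageGraph:
    """Directed graph whose nodes are dataset check codes (hypothetical design)."""

    def __init__(self):
        self.names = {}                  # check code -> human-readable dataset name
        self.parents = defaultdict(set)  # check code -> check codes of source datasets

    def register(self, name, rows, sources=()):
        """Add a dataset and the datasets it was derived from; return its check code."""
        node = md5_fingerprint(rows)
        self.names[node] = name
        self.parents[node].update(sources)
        return node

    def upstream(self, node):
        """Return the names of all ancestors of `node`, nearest first (breadth-first)."""
        seen, order = set(), []
        queue = deque(sorted(self.parents[node]))
        while queue:
            current = queue.popleft()
            if current in seen:
                continue
            seen.add(current)
            order.append(self.names[current])
            queue.extend(sorted(self.parents[current]))
        return order


if __name__ == "__main__":
    # Hypothetical records: raw bus GPS -> cleaned GPS -> daily report.
    graph = LineageGraph()
    raw = graph.register("bus_gps_raw", ["bus1,30.1,120.2", "bus2,30.3,120.4"])
    clean = graph.register("bus_gps_clean", ["bus1,30.1,120.2"], sources=[raw])
    report = graph.register("bus_daily_report", ["bus1,1"], sources=[clean])
    print(graph.upstream(report))
```

In this sketch the check code doubles as the node identifier, so a change in a dataset's content yields a new node; whether the paper's algorithm treats changed data this way is not stated in the abstract.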