Heilongjiang Jiaotong Keji (黑龙江交通科技), 2024, Vol. 47, Issue 1: 150-154.

Big Data Quality Control Architecture and Cleaning Method in Transportation Field

Xi Jiayi 1, Zhan Lu 2, Shen Kailong 3, Shen Xiangping 2

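Author information

  • 1. Zhongjiao Xinjie Technology Co., Ltd. (中交信捷科技有限公司), Beijing 100011
  • 2. Beijing Beida Qianfang Technology Co., Ltd. (北京北大千方科技有限公司), Beijing 100085
  • 3. Tianyi Cloud Technology Co., Ltd. (天翼云科技有限公司), Beijing 100007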

Abstract

To address the insufficient data quality of integrated transportation big data centers, which makes the data difficult to use in applications, a transportation data quality control architecture and a cleaning and governance method are proposed based on ISACA data governance ideas. The architecture comprises a data quality check guide, a dirty data identification method, and a data cleaning and governance method. The check guide defines data quality validation rules along six dimensions: standardization, integrity (completeness), consistency, accuracy, timeliness, and accessibility. A methodology that combines manual and automated steps is given for identifying and cleaning dirty data. Validation on rail transit card-swiping data from Chengdu identified a total of 11 data quality problems in the card-swiping data table, such as non-standard ticket card types and untimely data transmission. Of these problems, 90.9% were corrected through a combination of technical and management measures, and all subsequently received data met the requirements.
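
The abstract names the quality dimensions but does not list the concrete rules, so the following is only a minimal sketch of how rule-based checks along four of the six dimensions might look for a card-swiping record. The field names (`card_id`, `card_type`, `swipe_time`, `received_time`), the card type codes, and the 5-minute delay threshold are all hypothetical and are not taken from the paper.

```python
from datetime import datetime, timedelta

# Hypothetical set of valid ticket card type codes (not from the paper).
VALID_CARD_TYPES = {"01", "02", "03"}
# Hypothetical timeliness threshold: data should arrive within 5 minutes.
MAX_DELAY = timedelta(minutes=5)
# Hypothetical required fields of a card-swiping record.
REQUIRED_FIELDS = ("card_id", "card_type", "swipe_time", "received_time")


def check_record(record):
    """Return the list of quality dimensions that a record violates."""
    problems = []
    # Integrity: every required field must be present and non-empty.
    if any(record.get(f) in (None, "") for f in REQUIRED_FIELDS):
        problems.append("integrity")
        return problems  # the remaining checks need all fields
    # Standardization: card type must be one of the agreed codes.
    if record["card_type"] not in VALID_CARD_TYPES:
        problems.append("standardization")
    # Accuracy: data cannot be received before the swipe happened.
    delay = record["received_time"] - record["swipe_time"]
    if delay < timedelta(0):
        problems.append("accuracy")
    # Timeliness: transmission delay must stay under the threshold.
    elif delay > MAX_DELAY:
        problems.append("timeliness")
    return problems


def corrected_share(identified, corrected):
    """Percentage of identified problems corrected, to one decimal place."""
    return round(100 * corrected / identified, 1)
```

As a consistency check on the reported figures, `corrected_share(11, 10)` evaluates to 90.9, so the stated 90.9% is what 10 corrected problems out of 11 would give; the abstract itself states only the percentage.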

Key words

intelligent transportation/data quality/validation rules/cleaning method

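Publication year

2024
Heilongjiang Jiaotong Keji (黑龙江交通科技)
Heilongjiang Provincial Institute of Transportation Science; Heilongjiang Provincial Transportation Science and Technology Information Station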

Impact factor: 0.977
ISSN: 1008-3383
Number of references: 3