Analysis and Design of Railway Data Distributed Lake Warehouse Integrated Architecture

A sound method for classifying data resources, combined with an effective data lake architecture, can support the efficient storage, organization, and use of data across all railway business lines, and in turn support and improve operations. This paper first briefly analyzes existing data lake architectures, adopts the lakehouse (integrated lake and warehouse) concept, and classifies railway data by theme to match business processing needs. It then designs a distributed lakehouse architecture for railway data, describing the architecture and functions of the bureau-level sub-lakehouses and the group-level lakehouse of China State Railway Group, as well as the data flow between the two. Finally, it analyzes the characteristics and open problems of the proposed architecture, providing a reference for building an effective railway operations data lake.
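
The two-tier layout described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `SubLakehouse` and `GroupLakehouse`, the `ingest`, `export`, and `sync` methods, and the theme labels used in the usage note are all hypothetical, and the real data flow between bureau and group levels is certainly richer than an in-memory copy.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    bureau: str    # originating railway bureau (hypothetical identifier)
    theme: str     # business theme used for classification (hypothetical label)
    payload: dict  # raw business data


@dataclass
class SubLakehouse:
    """Bureau-level store that groups incoming records by theme."""
    bureau: str
    by_theme: dict = field(default_factory=lambda: defaultdict(list))

    def ingest(self, theme: str, payload: dict) -> None:
        self.by_theme[theme].append(Record(self.bureau, theme, payload))

    def export(self, theme: str) -> list:
        # Return a copy so the group level cannot mutate the bureau's list.
        return list(self.by_theme.get(theme, []))


@dataclass
class GroupLakehouse:
    """Group-level store that pulls one theme from every bureau."""
    by_theme: dict = field(default_factory=lambda: defaultdict(list))

    def sync(self, subs: list, theme: str) -> int:
        # Collect the theme's records from each sub-lakehouse in order.
        pulled = [r for s in subs for r in s.export(theme)]
        self.by_theme[theme].extend(pulled)
        return len(pulled)
```

Under these assumptions, two `SubLakehouse` instances could each ingest records under a theme such as `"freight"`, and a call like `GroupLakehouse().sync([a, b], "freight")` would gather that theme from both bureaus while leaving other themes at the bureau level.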

Keywords: railway big data; data governance; data lake; integrated lake and warehouse; distributed architecture

Li Guohua (李国华), Zou Dan (邹丹), Li Haijun (李海军), Sun Siqi (孙思齐), Wang Jianqiang (王建强)

Institute of Computing Technologies, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China

School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China

Funding: Science and Technology Research and Development Program of China State Railway Group Co., Ltd. (P2021S012)

Modern Information Technology (现代信息科技)
Guangdong Institute of Electronics (广东省电子学会)

ISSN: 2096-4706
Year, volume (issue): 2024, 8(1)