Analysis and Design of Railway Data Distributed Lake Warehouse Integrated Architecture
A scientific and reasonable data resource classification method and an effective data lake architecture can support the efficient storage, organization, and utilization of data from all railway business areas, and in turn support and optimize railway operations. This paper first briefly analyzes existing data lake architectures, defines the concept of the integrated lake and warehouse, and categorizes railway data by theme to meet business processing needs. Second, it designs a distributed integrated lake-warehouse architecture for railway data, elaborating on the architecture and functions of the sub lake warehouses at the railway bureau level and the overall lake warehouses of China Railway Group, as well as the data flow between the two. Finally, it analyzes the characteristics and open problems of the designed architecture, providing a reference for building an effective railway operation data lake.
railway big data; data governance; data lake; integrated lake and warehouse; distributed architecture