Real-time Processing Technologies for High-throughput Infrastructure Detection Data
危倩1, 杨森1, 赵一馨1, 姚莉1
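Author information
- 1. Infrastructure Inspection Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China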
Abstract
To meet the real-time processing requirements for detection data in 5G-based train-to-ground wireless transmission systems, a high-throughput real-time detection data processing method is proposed that integrates the distributed stream processing platform Apache Kafka (Kafka) and the distributed processing engine Apache Flink (Flink). A real-time detection data warehouse is built on Kafka to achieve ordered, fast reading and storage of real-time detection data. Flink serves as the real-time computing engine, processing and analyzing detection data in real time under high-throughput data streams. To shorten the time needed to associate real-time detection data with ledger data, a ledger association method based on a spatiotemporal index is proposed. Processing of real-time detection data collected by comprehensive inspection trains shows that the proposed method keeps data processing stable and efficient under high-throughput data streams, and that the average processing time of real-time detection data is within 1 min, which meets the needs of ground staff for remote real-time monitoring of the status of the detection equipment on comprehensive inspection trains.
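The abstract does not describe how the spatiotemporal index is organized, so the following is only a minimal sketch of one possible layout, not the authors' method. It assumes that ledger entries are non-overlapping mileage segments per line, that each segment may have several versions distinguished by an effective time, and that a detection record carries a line identifier, a mileage and a timestamp. All names (`LedgerEntry`, `LedgerIndex`, `line_id`, `valid_from`, the asset labels) are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    line_id: str      # hypothetical railway line identifier
    start_km: float   # segment start mileage (inclusive)
    end_km: float     # segment end mileage (exclusive)
    valid_from: int   # epoch seconds from which this version applies
    asset: str        # hypothetical asset description


class LedgerIndex:
    """Per-line index: segments sorted by mileage, versions sorted by time."""

    def __init__(self, entries):
        # line_id -> {(start_km, end_km): [versions of that segment]}
        grouped = defaultdict(lambda: defaultdict(list))
        for e in entries:
            grouped[e.line_id][(e.start_km, e.end_km)].append(e)

        self._lines = {}
        for line_id, segs in grouped.items():
            keys = sorted(segs)  # assumption: segments do not overlap
            versions = [sorted(segs[k], key=lambda e: e.valid_from) for k in keys]
            self._lines[line_id] = (
                [k[0] for k in keys],                          # segment starts
                [k[1] for k in keys],                          # segment ends
                versions,                                      # entries per segment
                [[e.valid_from for e in v] for v in versions]  # version times
            )

    def lookup(self, line_id, km, ts):
        """Return the ledger entry covering (line_id, km) at time ts, or None."""
        line = self._lines.get(line_id)
        if line is None:
            return None
        starts, ends, versions, times = line
        i = bisect_right(starts, km) - 1       # spatial step: binary search on mileage
        if i < 0 or km >= ends[i]:
            return None
        j = bisect_right(times[i], ts) - 1     # temporal step: latest version at ts
        if j < 0:
            return None
        return versions[i][j]


# Usage with made-up ledger data.
index = LedgerIndex([
    LedgerEntry("L1", 0.0, 10.0, 0, "bridge A"),
    LedgerEntry("L1", 10.0, 25.0, 0, "tunnel B"),
    LedgerEntry("L1", 10.0, 25.0, 1000, "tunnel B (renewed)"),
])
match = index.lookup("L1", 12.5, 1500)
print(match.asset if match else "no ledger entry")
```

Under these assumptions each association costs two binary searches instead of a scan over the whole ledger, which is one plausible way an index could shorten the association time mentioned in the abstract.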
Key words
high-speed railway/real-time data processing architecture/experimental study/infrastructure detection data/Kafka/Flink
Publication year
2024