
The Design and Implementation of Stream Processing for Data of Ground Automatic Weather Stations

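To handle the massive volume of ground automatic weather station data, whose observation density and frequency keep increasing, real-time stream processing based on Storm was designed in the Meteorological Big Data Cloud Platform (Tianqing), using large-scale parallel processing to shorten the processing time of ground automatic weather station data. In the stream processing, a processing topology decodes standard-format data messages directly; messages are acknowledged manually, and the data decoding component is anchored to the data ingestion component, so that every record is processed reliably; byte checks, time checks and similar validations during decoding filter out abnormal data; a batch-plus-timer sending strategy solves the problem of sending massive monitoring information to the Integrated Meteorological Operations Real-Time Monitoring System (Tianjing); and some spare resources are reserved at cluster deployment to cope with single-node failures. Application results show that the service latency of hourly data from national weather stations improved from 175 s with the China Integrated Meteorological Information Sharing System (CIMISS) to 78 s with Tianqing, and that of hourly data from about 6×10⁴ regional weather stations improved from 5 min with CIMISS to 2 min with Tianqing. After the real-time analysis system switched its data source to Tianqing, the number of stations retrievable at the same query time doubled compared with CIMISS. In December 2021, the Storm-based stream processing went into operation together with Tianqing at the national and provincial levels. It has run stably over the long term, providing efficient and stable ground automatic weather station data to users such as MICAPS4, SWAN2.0 and the real-time analysis system.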
To process the high-density, high-frequency mass data generated by ground automatic weather stations, a real-time stream processing system based on Storm is designed and implemented in the Meteorological Big Data Cloud Platform (Tianqing). It leverages large-scale parallel computing to speed up processing. For BUFR messages, a Storm topology is designed to directly process the standardized BUFR data delivered by RabbitMQ, reducing the intermediate steps between transmission and processing of observations. In the spout design, the manual acknowledgement mode of RabbitMQ is adopted to ensure that each message is effectively processed. In the decoding process, the bolt is anchored to the spout using the message identifier (ID) to ensure reliable processing of each message. Format and time checks are performed during decoding to filter out abnormal data. A batch-plus-timer sending strategy for monitoring information addresses the loss of ingested data caused by port occupancy when large volumes of monitoring data are transmitted. A startup strategy with a configurable number of spouts and bolts allows quick tuning according to system resources. During cluster deployment, some resources are reserved so that tasks migrate automatically, without disrupting operations, when a node in the cluster fails. The system automatically reconnects to message queues and databases, which improves stability and gives it self-healing capability. Application results show that the service latency of hourly data from 2442 national stations decreased from 175 s with CIMISS to 78 s with Tianqing, and that of hourly data from over 60000 regional stations decreased from 5 min with CIMISS to 2 min with Tianqing. After the data source of the ART (analysis of real time) system was switched to Tianqing, the number of stations retrievable at the same query time doubled compared with CIMISS. With other conditions unchanged, this can effectively improve the quality of ART real-time products. Specialized stream processing also handles operational scenarios in which the data access process of a provincial Tianqing for its own ground automatic weather stations differs from that for other provinces, enabling the provincial Tianqing to process nationwide ground automatic weather station data quickly. In December 2021, Storm-based stream processing was put into operation alongside Tianqing in the national and provincial meteorological information departments. It has run stably for more than two years, delivering reliable ground automatic weather station data to users including MICAPS4, SWAN2.0 and the ART system.
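
The batch-plus-timer sending strategy mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (the system itself is built on Storm), and the abstract gives no thresholds: the class name `BatchTimedSender`, the `send` callback, and the `max_batch` and `max_wait_s` values are all hypothetical. The idea shown is only that monitoring records are buffered and sent as one batch either when the buffer is full or when the oldest buffered record has waited long enough, so that many small sends are replaced by fewer, larger ones.

```python
import time
from typing import Any, Callable, List, Optional


class BatchTimedSender:
    """Buffer records and flush them when the batch is full or too old.

    Hypothetical illustration: names and default thresholds are assumptions,
    not values taken from the paper.
    """

    def __init__(
        self,
        send: Callable[[List[Any]], None],
        max_batch: int = 100,
        max_wait_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send              # callback that transmits one batch
        self._max_batch = max_batch    # size trigger
        self._max_wait_s = max_wait_s  # age trigger, in seconds
        self._clock = clock            # injectable so the timer can be tested
        self._buffer: List[Any] = []
        self._first_at: Optional[float] = None  # arrival time of oldest record

    def add(self, record: Any) -> None:
        """Buffer one record; flush immediately if the batch is full."""
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(record)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes if the oldest record has waited too long."""
        if self._buffer and self._clock() - self._first_at >= self._max_wait_s:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch (no-op when empty)."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._first_at = None
        self._send(batch)
```

In use, `add` would be called for each monitoring record and `tick` from a periodic timer; the size trigger bounds the batch during bursts, while the age trigger bounds the delay when traffic is light.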

meteorological big data cloud platform; ground automatic weather station; Storm; RabbitMQ; stream processing; BUFR

Xiao Weiqing, Xue Lei, Liu Zhen, Luo Bing, Wang Ying, Zhang Lai'en, Guo Ping, Huo Qing, Han Shuli, He Wenchun


National Meteorological Information Center, Beijing 100081


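China Meteorological Administration Meteorological Radar Data Sharing Platform Real-Time Data Transmission System Construction Project (ZQC-J19187); China Meteorological Administration Meteorological Informatization System Engineering, Sensing Network Data Exchange Platform Subsystem, Data Decoding Software Project (ZQC-H22320); China Meteorological Administration Innovation and Development Special Project (CXFZ2021Z007)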

2024

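Journal of Applied Meteorological Science
Chinese Academy of Meteorological Sciences; National Meteorological Center; National Satellite Meteorological Center; National Climate Center; National Meteorological Information Center; Meteorological Observation Center of China Meteorological Administration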


CSTPCD; Peking University Core Journals
Impact factor: 1.459
ISSN: 1001-7313
Year, volume (issue): 2024, 35(3)