Performance Optimization of Complex Stencil in Weather Forecast Model WRF
Di Jianqiang 1, Yuan Liang 2, Zhang Yunquan 2, Zhang Sijia 3
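Author information
- 1. High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; School of Information Engineering, Dalian Ocean University, Dalian, Liaoning 116023
- 2. High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
- 3. School of Information Engineering, Dalian Ocean University, Dalian, Liaoning 116023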
Abstract
The Weather Research and Forecasting (WRF) model is a widely used mesoscale numerical weather prediction system that plays an important role in atmospheric research and operational forecasting. Stencil computation is a common nested-loop pattern in scientific and engineering applications. WRF performs a large number of complex stencil computations on spatial grids to numerically solve the equations of atmospheric dynamics and thermodynamics. These stencils are characterized by multiple dimensions, multiple variables, special handling at physical model boundaries, and complex physical and dynamical processes. This study analyzes the typical stencil patterns in WRF, identifies and abstracts the concept of the "intermediate variable" in typical stencil loops, and designs and implements three optimization schemes around it: intermediate variable computation merging, intermediate variable dimensionality-reduction storage, and intermediate variable extraction. The schemes improve data locality, increase data reuse and space reuse, and reduce redundant computation and memory access overhead. The results show that typical stencil hotspot functions of WRF 4.2 restructured with these schemes achieve good performance gains on both Intel and Hygon CPUs, with maximum speedups of 21.3% and 17.8%, respectively.
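As a rough illustration of the first two schemes named in the abstract (computation merging and dimensionality-reduction storage), the sketch below uses a toy flux-difference stencil. It is not taken from WRF or from the paper: the function names, the arrays `q` and `u`, and the flux formula are all hypothetical, chosen only to show an intermediate variable that is first stored as a full 2D array and then reduced to two 1D rows inside one merged loop nest.

```python
def baseline(q, u):
    """Two loop nests; the intermediate variable `flux` is a full 2D array."""
    ni, nj = len(q), len(q[0])
    flux = [[0.0] * nj for _ in range(ni)]
    # Loop nest 1: compute the intermediate variable everywhere.
    for i in range(1, ni):
        for j in range(nj):
            flux[i][j] = 0.5 * u[i][j] * (q[i][j] + q[i - 1][j])
    tend = [[0.0] * nj for _ in range(ni)]
    # Loop nest 2: consume it in a second sweep over the grid.
    for i in range(1, ni - 1):
        for j in range(nj):
            tend[i][j] = -(flux[i + 1][j] - flux[i][j])
    return tend


def merged_reduced(q, u):
    """One loop nest; `flux` is kept only as two 1D rows (hypothetical sketch)."""
    ni, nj = len(q), len(q[0])
    tend = [[0.0] * nj for _ in range(ni)]
    if ni < 3:
        return tend  # no interior rows to update
    # Flux row for i = 1, needed by the first interior row.
    lo = [0.5 * u[1][j] * (q[1][j] + q[0][j]) for j in range(nj)]
    for i in range(1, ni - 1):
        hi = [0.0] * nj
        for j in range(nj):
            # Produce the flux for row i + 1 and use it immediately.
            hi[j] = 0.5 * u[i + 1][j] * (q[i + 1][j] + q[i][j])
            tend[i][j] = -(hi[j] - lo[j])
        lo = hi  # the upper row becomes the lower row of the next iteration
    return tend
```

Both functions evaluate the same expressions in the same order, so they return identical results; the second touches each flux value once while it is still in a small buffer instead of writing and re-reading a grid-sized array.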
Key words
WRF/Stencil computation/Intermediate variable/Optimization scheme/Data locality/Hotspot function/Performance improvement
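Funding
National Natural Science Foundation of China (61972376)
National Natural Science Foundation of China (62072431)
National Natural Science Foundation of China (62032023)
Huawei project (TC20220914048)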
Year of publication
2024