Journal on Numerical Methods and Computer Applications (数值计算与计算机应用), 2024, Vol. 45, Issue 2: 115-135. DOI: 10.12288/szjs.s2023-0916

HETEROGENEOUS PARALLEL OPTIMIZATION OF TEARING FINITE ELEMENT METHODS FOR STRUCTURAL DYNAMICS CALCULATIONS

Nie Ningming (聂宁明) 1, Yao Kehan (姚柯寒) 2, Zeng Yan (曾艳) 3, Feng Yangde (冯仰德) 2, Wang Jue (王珏) 2, Li Shunde (李顺德) 1, Zhang Jilin (张纪林) 3, Wan Jian (万健) 3, Lin Kehao (林克豪) 3, Gao Yue (高岳) 4, Wang Yangang (王彦棡) 1, Wang Zongguo (王宗国) 1

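Author information

  • 1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190
  • 2. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190; School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018
  • 3. School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018
  • 4. China Institute of Atomic Energy, Beijing 102413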

Abstract

This paper combines the large-scale tearing finite element method with the Newmark integration method to solve structural dynamics problems in parallel at large scale and high resolution. For heterogeneous platforms, a multi-level load balancing strategy that combines static and dynamic methods is designed at both the inter-node and intra-node levels. Between nodes, a graph bisection algorithm that balances subdomain boundaries is adopted, based on the characteristics of the subdomain boundaries produced by the tearing finite element method, to equalize the computational load of the subdomains. Within a node, the computational load is dynamically optimized according to the performance differences among the computing units of the heterogeneous platform. Multi-stream parallel optimization is applied to the core computing module, batched matrix-vector multiplication, to improve utilization of the heterogeneous computing platform. The optimizations have been integrated into HARSA-feti, a high-performance numerical simulation software package for structural mechanics. Experiments use a flow-induced vibration simulation of a real reactor nuclear fuel assembly as the test case. The results show that simulation performance improves by more than 71.3%, and a high-resolution simulation of full-core fuel rod assemblies at the scale of ten billion mesh cells is achieved for the first time. Relative to 1,000 GPUs, the strong and weak scaling parallel efficiencies on 16,000 GPUs reach 74.1% and 81.1%, respectively.
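As background for the time integration named in the abstract, the following is a minimal sketch of one common form of the Newmark method for the linear system M a + C v + K u = f(t). It is not taken from HARSA-feti: the function name `newmark`, its parameters, and the choice beta = 1/4, gamma = 1/2 (the average-acceleration variant) are assumptions made for illustration, and the dense `np.linalg.solve` call stands in for the domain-decomposed solve that the tearing finite element method would perform at each step.

```python
import numpy as np

def newmark(M, C, K, f, u0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Integrate M a + C v + K u = f(t) with the Newmark method.

    Hypothetical helper for illustration; f is a callable t -> load vector.
    Returns displacement, velocity and acceleration at the final time.
    """
    u = np.asarray(u0, dtype=float).copy()
    v = np.asarray(v0, dtype=float).copy()
    # Initial acceleration from the equation of motion at t = 0.
    a = np.linalg.solve(M, f(0.0) - C @ v - K @ u)
    # Effective stiffness is constant for a fixed time step.
    K_eff = K + gamma / (beta * dt) * C + M / (beta * dt**2)
    for n in range(1, n_steps + 1):
        # Effective load built from the known state at step n - 1.
        rhs = (f(n * dt)
               + M @ (u / (beta * dt**2) + v / (beta * dt)
                      + (0.5 / beta - 1.0) * a)
               + C @ (gamma / (beta * dt) * u + (gamma / beta - 1.0) * v
                      + dt * (0.5 * gamma / beta - 1.0) * a))
        # Stand-in for the large sparse solve (assumed dense here).
        u_new = np.linalg.solve(K_eff, rhs)
        a_new = ((u_new - u) / (beta * dt**2) - v / (beta * dt)
                 - (0.5 / beta - 1.0) * a)
        v = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, a = u_new, a_new
    return u, v, a
```

For an undamped single-degree-of-freedom oscillator with unit mass and stiffness, released from u = 1 at rest, the sketch should track cos(t) closely for small time steps. In a large-scale setting the per-step linear solve dominates the cost, which is where a domain decomposition solver would replace the dense call.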

Key words

Structural dynamics/Large-scale parallel computing/Massively parallel/Load balancing/Heterogeneous computing/Matrix-vector multiplication

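Publication year

2024
Journal on Numerical Methods and Computer Applications (数值计算与计算机应用)
Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Impact factor: 0.188
ISSN: 1000-3266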