
Optimizing Distributed GMRES Algorithm with Mixed Precision

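The generalized minimum residual (GMRES) method is an iterative method for solving sparse linear systems and is widely used in scientific and engineering computing. The explosive growth of data has rapidly increased the scale of the problems that GMRES is used to solve. To support large-scale problems, researchers have proposed distributed GMRES algorithms for clusters. In most existing clusters, however, inter-node network performance still lags well behind the high-speed GPU interconnect within a node, which limits the performance of distributed GMRES. For distributed GMRES on GPU clusters, this paper proposes a mixed-precision acceleration method that uses low-precision floating-point representations to significantly reduce the time spent in communication. It also proposes a precision-control algorithm for data transfer that dynamically and adaptively adjusts the precision of the transmitted data to preserve the best solution quality of the iterative algorithm. Experimental results show that the proposed mixed-precision optimization achieves an average speedup of 2.4×, and an average speedup of 7.6× when combined with other optimization methods.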
Optimizing Distributed GMRES Algorithm with Mixed Precision
The generalized minimum residual (GMRES) method is an iterative method for solving sparse linear systems. It is widely used in areas such as scientific and engineering computing. Exponential data growth is rapidly expanding the scale of the problems solved with GMRES. To support large-scale problems, researchers have implemented distributed GMRES algorithms on clusters. However, current inter-node networks still lag significantly behind intra-node fabrics in both bandwidth and latency, which greatly limits the performance of distributed GMRES. This paper proposes a mixed-precision approach for optimizing the GMRES algorithm on GPU clusters: the transferred data is represented in a low-precision format, so the network traffic during inter-GPU communication is significantly reduced. In addition, this paper proposes a balancing algorithm that dynamically adjusts the precision of the transferred data so that the solver still reaches the required residual. Experimental results show that the proposed method achieves an average speedup of 2.4×, and an average speedup of 7.6× when combined with other optimizations.
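The abstract does not spell out how the transfer precision is chosen, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a simple rule: send a vector in the narrowest floating-point format whose round-trip error, relative to the largest entry, stays within a tolerance. The names `FORMATS`, `round_trip`, and `pick_precision` are hypothetical, and Python's standard `struct` module stands in for the real GPU-to-GPU transfer.

```python
import struct

# Candidate wire formats, narrowest first. The struct codes are standard:
# 'e' = IEEE half (2 bytes), 'f' = single (4 bytes), 'd' = double (8 bytes).
FORMATS = [("fp16", "e"), ("fp32", "f"), ("fp64", "d")]


def round_trip(values, code):
    """Pack values in the given format and unpack them again.

    Returns what a receiver would see and the payload size in bytes.
    """
    fmt = f"<{len(values)}{code}"
    payload = struct.pack(fmt, *values)
    return list(struct.unpack(fmt, payload)), len(payload)


def pick_precision(values, tol):
    """Pick the narrowest format whose relative max-norm error is <= tol.

    Assumes `values` is a non-empty list of finite floats and tol >= 0.
    This selection rule is an assumption made for illustration only.
    """
    scale = max(abs(v) for v in values) or 1.0
    for name, code in FORMATS:
        try:
            received, nbytes = round_trip(values, code)
        except (OverflowError, struct.error):
            continue  # a value is out of range for this format; try a wider one
        err = max(abs(a - b) for a, b in zip(values, received)) / scale
        if err <= tol:
            return name, nbytes, err
    raise ValueError("no format meets the tolerance (is tol negative?)")


if __name__ == "__main__":
    vector = [1.0, 1e-3, 0.3333333]
    for tol in (1e-2, 1e-6, 0.0):
        name, nbytes, err = pick_precision(vector, tol)
        print(f"tol={tol:g}: send as {name}, {nbytes} bytes, rel. error {err:.2e}")
```

A looser tolerance lets the sender use a narrower format and so fewer bytes on the wire, which is the trade-off the abstract describes between communication cost and the residual the solver can reach.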

Generalized minimum residual; Mixed precision; GPU cluster; Distributed system

郭帅哲、高建花、计卫星


School of Computer Science, Beijing Institute of Technology, Beijing 100081

Generalized minimum residual method; Mixed precision; GPU cluster; Distributed system

2024

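Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)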


CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.944
ISSN:1002-137X
Year, volume (issue): 2024, 51(9)