GPU-based Algorithm Optimization for Streaming Module of Lattice Boltzmann Method
The Lattice Boltzmann Method (LBM) is a Computational Fluid Dynamics (CFD) method that operates at the mesoscopic simulation scale. Its computation is carried out on a large number of discrete lattice points, which makes it well suited to parallel execution. The many arithmetic logic units in a Graphics Processing Unit (GPU) are likewise suited to large-scale parallel computing, so a GPU-based parallel LBM algorithm can improve computational efficiency. However, in the streaming module of the LBM algorithm, the calculation at each lattice point requires communication with other lattice points, which creates strong data dependence. In this study, a GPU-based optimization strategy for the LBM streaming module is proposed. First, the implementation logic of the streaming step is analyzed in detail, and the three-dimensional model is decomposed into several two-dimensional models according to the velocity components; this dimension reduction lowers the complexity of the model. Second, the differences in lattice-point data before and after the streaming calculation are analyzed, the communication rules of the streaming module are determined through data positioning, and the data exchange modes between lattice points are classified. The discrete two-dimensional models are then divided into regions according to the classified exchange modes, and a new data communication mode is designed. As a result, the influence of data dependence is eliminated and the streaming module is fully parallelized. In tests, the parallel algorithm achieves a speedup of 1.92 times on 1.3 × 10^8 grid points, which shows that it parallelizes well. Moreover, compared with an algorithm that does not parallelize the streaming module, the optimization strategy in this study improves parallel computing efficiency by 30%.
Keywords: High Performance Computing (HPC); Lattice Boltzmann Method (LBM); Graphics Processing Unit (GPU); parallel optimization; data rearrangement
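To make the data dependence in the streaming step concrete, the following is a minimal sketch of streaming on a two-dimensional lattice. It is not the region-based exchange scheme proposed in the study, whose details the abstract does not give. It shows the generic "pull" form with a separate destination buffer, in which each lattice point only reads from its neighbors and can therefore be updated independently. The D2Q9 velocity set, the periodic boundaries, and the names `VELOCITIES`, `make_lattice`, and `stream_pull` are assumptions chosen for illustration.

```python
# Assumed D2Q9 velocity set: one (cx, cy) offset per distribution direction.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]


def make_lattice(nx, ny, value=0.0):
    """Return distributions indexed as f[q][y][x], all set to `value`."""
    return [[[value for _ in range(nx)] for _ in range(ny)]
            for _ in VELOCITIES]


def stream_pull(f_src, nx, ny):
    """Stream every distribution one lattice step along its velocity.

    Each destination entry is read from the upstream neighbor in f_src
    (periodic wrap-around is an assumption of this sketch). Because
    f_src is never written, the iterations are independent of each
    other; on a GPU each (q, y, x) triple could map to one thread.
    """
    f_dst = make_lattice(nx, ny)
    for q, (cx, cy) in enumerate(VELOCITIES):
        for y in range(ny):
            for x in range(nx):
                f_dst[q][y][x] = f_src[q][(y - cy) % ny][(x - cx) % nx]
    return f_dst


# Usage: place a value in the +x direction at (x=0, y=0) and stream once.
f = make_lattice(4, 3)
f[1][0][0] = 1.0
f = stream_pull(f, 4, 3)  # the value now sits at (x=1, y=0)
```

The second buffer removes the write-after-read hazard at the cost of doubling memory. Updating a single buffer in place would overwrite values that neighboring points still need, which is the kind of dependence the study's classified exchange modes are designed to avoid.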