DRM: A GPU-parallel SpMV storage format based on an iterative merge strategy
Sparse matrix-vector multiplication (SpMV) is central to the solution of linear systems and is one of the core problems in scientific computing and engineering practice. Its performance depends heavily on the distribution of non-zero elements in the sparse matrix. Sparse diagonal matrices are a special class of sparse matrices whose non-zero elements are densely arranged along diagonals. For such matrices, researchers have proposed various storage formats on the GPU platform that improve SpMV performance but still suffer from zero padding and load imbalance. To address these issues, a DRM (Divide-Rearrange & Merge) storage format is proposed. This format uses a matrix partitioning strategy based on a fixed threshold and a matrix reconstruction strategy based on iterative merging to reduce zero padding and balance the load between blocks. Experimental results show that on the NVIDIA® Tesla® V100 platform, compared with the DIA, HDC, HDIA, and DIA-Adaptive formats, DRM speeds up execution time by factors of 20.76, 1.94, 1.13, and 2.26, respectively, and improves floating-point performance by factors of 1.54, 5.28, 1.13, and 1.94, respectively.
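
To make the zero-padding issue concrete, the following minimal Python sketch stores a small square matrix in the baseline DIA layout and performs SpMV on it. It is not the DRM format, and it runs on the CPU for illustration only; the function names `dense_to_dia`, `dia_spmv`, and `padding_ratio`, the row-aligned diagonal layout, and the example matrix are all assumptions introduced here.

```python
def dense_to_dia(A):
    """Convert a square dense matrix (list of lists) to a DIA-style layout.

    Returns (offsets, data), where data[d][i] holds A[i][i + offsets[d]]
    and is padded with 0 wherever that column index falls outside the matrix.
    """
    n = len(A)
    # Every diagonal (offset = column - row) that contains a non-zero is stored.
    offsets = sorted({j - i for i in range(n) for j in range(n) if A[i][j] != 0})
    data = [
        [A[i][i + off] if 0 <= i + off < n else 0 for i in range(n)]
        for off in offsets
    ]
    return offsets, data


def dia_spmv(offsets, data, x):
    """Compute y = A @ x from the DIA-style layout."""
    n = len(x)
    y = [0] * n
    for off, diag in zip(offsets, data):
        for i in range(n):
            j = i + off
            if 0 <= j < n:  # skip slots that lie outside the matrix
                y[i] += diag[i] * x[j]
    return y


def padding_ratio(A, data):
    """Fraction of stored slots that do not hold a non-zero of A."""
    nnz = sum(1 for row in A for v in row if v != 0)
    stored = sum(len(diag) for diag in data)
    return (stored - nnz) / stored


# Hypothetical example: one full main diagonal plus two nearly empty diagonals.
A = [
    [1, 0, 0, 0],
    [0, 2, 0, 5],
    [0, 0, 3, 0],
    [7, 0, 0, 4],
]
offsets, data = dense_to_dia(A)
y = dia_spmv(offsets, data, [1, 2, 3, 4])
ratio = padding_ratio(A, data)
```

In this example the two off-diagonals each contribute a single non-zero but occupy a full row of storage, so half of the stored slots are padding. Diagonals with few non-zeros inflate both memory traffic and wasted work in plain DIA, which is the kind of overhead that partitioning and merging strategies aim to reduce.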