Singular Value Decomposition Method Based on Domestic Heterogeneous Platforms
With the development of applications that demand high computing power, such as deep learning, heterogeneous computing is becoming an important direction for parallel computing. Domestic heterogeneous platforms have developed rapidly in recent years, so customizing algorithms and software for domestic platform architectures has acquired great significance. Singular Value Decomposition (SVD) is a powerful technique provided by linear algebra libraries for processing general matrices, with applications in many fields, such as scientific computing, artificial intelligence, and signal processing. However, the performance of the SVD algorithm in the available library of a domestic accelerator is far inferior to that on NVIDIA platforms, which poses a challenge for the efficient porting of related applications. To this end, a matrix diagonalization method, mySVD, is proposed for domestic accelerators; it adjusts the algorithmic flow to reduce thread startup and memory access overhead. Computationally intensive tasks are offloaded to accelerators, and a divide-and-conquer algorithm is designed for domestic heterogeneous platforms, in which a multi-stream, task-parallel method generates the singular vector matrices using the CPU together with the accelerator. On this basis, an efficient porting and optimization scheme is developed for singular value algorithms. Experimental results show that, across different matrix scales, this scheme achieves up to 9.8 times the performance of the existing commercial closed-source linear algebra library MKL and up to 5.5 times that of the existing open-source heterogeneous computing linear algebra library MAGMA. Finally, the proposed algorithm is applied to image processing and compared across platforms against MATLAB and cuSOLVER, the GPU linear algebra library of NVIDIA. The algorithm demonstrates an increase in speed and generates images highly similar to the originals.
parallel computing; heterogeneous computing; Singular Value Decomposition (SVD); domestic platform; image processing