OPTIMIZED REALIZATION OF LAPLACIAN ALGORITHM FOR FT-M7002
To exploit the platform advantages of the domestic FT high-performance processor, we parallelize and optimize the Laplace algorithm for it. Building on explicit data movement, the DMA data transfer mechanism is used to address matrix transposition, non-contiguous data access, and idle gaps in data transfer time, which improves program performance and exposes both the data-level and the instruction-level parallelism of the program. Experimental results show that the optimized, vectorized parallel algorithm achieves a speedup of 2.02 to 2.55 times over the unoptimized version. Compared with the TMS320C6678 processor, the FT-optimized algorithm is 1.48 to 2.56 times as efficient.
High-performance processor; Laplace algorithm; Parallel optimization; DMA data transmission
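The idea of overlapping data transfer with computation can be illustrated with a minimal sketch in C. It is not the implementation described above: it assumes a 4-neighbour Laplacian kernel, uses a plain memcpy as a stand-in for the DMA engine, and runs as scalar code without the vectorization. The names laplacian_tiled, fake_dma_copy, TILE_ROWS and MAX_WIDTH are hypothetical. The image is processed in row tiles using two input buffers (ping-pong), so that on real hardware the next tile could be fetched while the current one is being computed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE_ROWS 4   /* hypothetical tile height (output rows per tile) */
#define MAX_WIDTH 64  /* hypothetical maximum row width of the local buffers */

/* Stand-in for a DMA transfer. On a real DSP this would be an asynchronous
 * request to the DMA engine; here it is a blocking memcpy (assumption). */
static void fake_dma_copy(float *dst, const float *src, size_t count)
{
    memcpy(dst, src, count * sizeof(float));
}

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* 4-neighbour Laplacian of a row-major image; border pixels are set to 0.
 * Interior rows are processed tile by tile with ping-pong input buffers. */
void laplacian_tiled(const float *in, float *out, size_t rows, size_t cols)
{
    static float local_in[2][(TILE_ROWS + 2) * MAX_WIDTH];
    static float local_out[TILE_ROWS * MAX_WIDTH];

    assert(cols <= MAX_WIDTH);
    memset(out, 0, rows * cols * sizeof(float));
    if (rows < 3 || cols < 3)
        return; /* no interior pixels */

    const size_t last = rows - 1; /* interior rows are [1, last) */
    size_t r0 = 1;
    int cur = 0;

    /* Fetch the first tile, including one halo row above and one below. */
    size_t n = min_size(TILE_ROWS, last - r0);
    fake_dma_copy(local_in[cur], in + (r0 - 1) * cols, (n + 2) * cols);

    while (r0 < last) {
        n = min_size(TILE_ROWS, last - r0);
        size_t next_r0 = r0 + n;

        /* Prefetch the next tile into the other buffer. With a real
         * asynchronous DMA this would overlap with the loop below. */
        if (next_r0 < last) {
            size_t next_n = min_size(TILE_ROWS, last - next_r0);
            fake_dma_copy(local_in[1 - cur], in + (next_r0 - 1) * cols,
                          (next_n + 2) * cols);
        }

        /* Compute the current tile; tile row i + 1 is image row r0 + i. */
        const float *t = local_in[cur];
        for (size_t i = 0; i < n; i++) {
            local_out[i * cols] = 0.0f;
            local_out[i * cols + cols - 1] = 0.0f;
            for (size_t j = 1; j + 1 < cols; j++) {
                local_out[i * cols + j] =
                    t[i * cols + j] + t[(i + 2) * cols + j] +
                    t[(i + 1) * cols + j - 1] + t[(i + 1) * cols + j + 1] -
                    4.0f * t[(i + 1) * cols + j];
            }
        }

        /* Write the finished rows back to the output image. */
        fake_dma_copy(out + r0 * cols, local_out, n * cols);

        r0 = next_r0;
        cur = 1 - cur;
    }
}
```

Calling laplacian_tiled(in, out, rows, cols) on a row-major float image fills out with the filtered result. Each tile is fetched as whole contiguous rows, which is one simple way to avoid non-contiguous access; the transposition handling and vectorized inner loops mentioned in the abstract are not modelled here.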