Matrix Multiplication Acceleration Based on NEON Parallel Computing Architecture
The demand for signal processing on computers is constantly increasing. With the rapid development of the ARM architecture and the rapid rise of domestic processors based on it, investigating general signal processing acceleration technology for the ARM platform is of great significance. Based on an analysis of the ARMv8 architecture and NEON technology, the FT-2000/4 (an ARMv8 processor) is adopted as the experimental platform to examine the acceleration of a representative DSP library on the ARMv8 architecture. Taking matrix operations as an example, a NEON-based general matrix multiplication algorithm is proposed. Experimental results show that the proposed algorithm achieves significant acceleration on the ARM architecture. The work can provide technical support for building a comprehensive and efficient general signal processing library for the ARM architecture.
general signal processing; ARMv8; FT-2000/4; NEON; matrix multiplication
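
As a rough illustration of what NEON-based matrix multiplication can look like, the following is a minimal sketch and not the algorithm proposed in the paper, whose details are not given here. It assumes row-major single-precision matrices; the function name `matmul_f32` and its parameters are hypothetical. Four adjacent elements of one output row are accumulated in a single 128-bit NEON register, and a scalar loop handles the remaining columns (and the whole computation when NEON is unavailable).

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: C = A * B for row-major float32 matrices.
 * A is m x k, B is k x n, C is m x n. */
void matmul_f32(const float *A, const float *B, float *C,
                size_t m, size_t k, size_t n)
{
    for (size_t i = 0; i < m; ++i) {
        size_t j = 0;
#if HAVE_NEON
        /* Vector path: compute four adjacent columns of row i at once. */
        for (; j + 4 <= n; j += 4) {
            float32x4_t acc = vdupq_n_f32(0.0f);
            for (size_t p = 0; p < k; ++p) {
                /* Load B[p][j..j+3] and scale it by the scalar A[i][p]. */
                float32x4_t b = vld1q_f32(&B[p * n + j]);
                acc = vmlaq_n_f32(acc, b, A[i * k + p]);
            }
            vst1q_f32(&C[i * n + j], acc);
        }
#endif
        /* Scalar path: leftover columns, or everything without NEON. */
        for (; j < n; ++j) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; ++p)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] = sum;
        }
    }
}
```

A production kernel would typically add register blocking over several rows and cache-aware tiling; this sketch only shows the basic data-parallel idea of processing four outputs per instruction.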