Vector Shuffling Based Parallel Radix-4 FFT Algorithm for Fixed Point Data on Vector DSP
A fixed-point Fast Fourier Transform (FFT) reduces hardware requirements and computes faster than a floating-point implementation while keeping accuracy within a reasonable range. Based on the hardware characteristics of high-performance vector Digital Signal Processors (DSPs), this paper constructs an efficient instruction-level parallel algorithm for the Radix-4 complex FFT. The algorithm takes into account the calculation process of the Radix-4 complex FFT and the characteristics of its butterfly units, and it integrates SIMD computation, vector shuffling, indexed DMA, and other techniques into the Radix-4 transformation process. It effectively controls the movement of data blocks between memory and the on-chip cache during computation and improves the utilization of the SIMD processing units. An experimental study is conducted on the FT-M7002DSK platform of the YHFT-M7002 processor, which has independent intellectual property rights. The results show that our algorithm is on average 3.79 times faster than the corresponding TMS320C6678 library function, whose performance was obtained with the CCS simulator.
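For readers unfamiliar with the butterfly unit mentioned above, the following is a minimal sketch of a single Radix-4 butterfly on fixed-point complex data, written in Python for readability. It is not the implementation evaluated in this paper: the function name `radix4_butterfly`, the representation of complex values as `(re, im)` integer pairs, and the 2-bit right shift used to guard against overflow are all assumptions made for illustration, and twiddle-factor multiplication is omitted.

```python
def radix4_butterfly(a, b, c, d):
    """Twiddle-free Radix-4 butterfly (a 4-point DFT) on integer complex pairs.

    Each input is a (re, im) tuple of Python ints standing in for
    fixed-point values. Every output is shifted right by 2 bits
    (divide by 4, rounding toward negative infinity), which is an
    assumed scaling choice to keep results in the input range.
    """
    # First stage: sums and differences of the (a, c) and (b, d) pairs.
    t0 = (a[0] + c[0], a[1] + c[1])   # a + c
    t1 = (a[0] - c[0], a[1] - c[1])   # a - c
    t2 = (b[0] + d[0], b[1] + d[1])   # b + d
    t3 = (b[0] - d[0], b[1] - d[1])   # b - d

    # Multiplying by -j maps (re, im) to (im, -re); no real multiply needed.
    mj_t3 = (t3[1], -t3[0])

    # Second stage: combine into the four outputs.
    y0 = (t0[0] + t2[0], t0[1] + t2[1])         # a + b + c + d
    y1 = (t1[0] + mj_t3[0], t1[1] + mj_t3[1])   # a - j*b - c + j*d
    y2 = (t0[0] - t2[0], t0[1] - t2[1])         # a - b + c - d
    y3 = (t1[0] - mj_t3[0], t1[1] - mj_t3[1])   # a + j*b - c - j*d

    # Assumed overflow guard: scale each output by 1/4.
    return tuple((re >> 2, im >> 2) for re, im in (y0, y1, y2, y3))
```

The butterfly uses only additions, subtractions, and component swaps, so the same operations can be applied to many butterflies at once when the operands are laid out across vector lanes. Arranging data into that layout is the role that shuffling plays in a SIMD setting.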