
Research on a high-efficiency grouping realigning algorithm based on vector shuffling and DMA transfer

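To increase the processing speed of the fast Fourier transform and improve the performance of systems in related application fields, a grouping realigning method is proposed for vector Very Long Instruction Word (VLIW) processors; it applies when the number of sample points in the data is an integer power of 2. The method divides the input data into groups of a given size, shuffles and realigns the data within each group, and then transfers the data of each group in turn to the result array by Direct Memory Access (DMA). This effectively reduces the number of bit-reversed indices that must be computed and eliminates the need to address individual data elements. In addition, to exploit the hardware's "ping-pong" storage capability, a method that performs vector shuffling and DMA transfer in parallel is proposed, which further improves the execution efficiency of the grouping realigning algorithm. The algorithm was implemented on the FT-M7002 processor, and the experimental results show that the method suits vector VLIW processors, produces correct results, and effectively increases the execution speed of realignment.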
High-efficiency grouping realigning algorithm based on vector shuffling and DMA
Realignment is a key performance bottleneck of the fast Fourier transform. To accelerate fast Fourier transform processing on vector Very Long Instruction Word (VLIW) processors and improve the performance of application systems in related fields, a grouping realigning method was proposed for any data size that is an integer power of 2. The method consists of three steps: dividing the input data into several groups of a given size, shuffling and realigning the data within each group, and transferring the data of each group to the destination array in turn through Direct Memory Access (DMA). It effectively reduces the number of bit-reversed indices that must be computed and avoids computing an index for every single data element. Moreover, for architectures with a "ping-pong" storage mechanism, a method that runs shuffling and DMA transfer in parallel was proposed on this basis. The algorithm was implemented and verified experimentally on the FT-M7002 processor. The experimental results showed that the method is correct, effectively improves the execution speed of realignment, and is well suited to vector VLIW processors.
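The three steps named in the abstract (grouping, shuffling within a group, moving each group to the destination array) can be illustrated with a minimal sketch. The abstract does not give the index decomposition, so the one used here is an assumption rather than the authors' implementation: each contiguous input group is permuted with a single shared shuffle table, then written to the output with a stride equal to the number of groups, starting at the bit-reversed group number. The names `bit_reverse` and `grouped_bit_reversal` are hypothetical, and the strided slice assignment merely stands in for a DMA transfer.

```python
def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def grouped_bit_reversal(data, group_size):
    """Bit-reversal reordering done group by group (illustrative sketch).

    Assumed decomposition: for input index i = h * G + l, with G the group
    size and M = N / G the number of groups, the bit-reversed index is
    rev(l) * M + rev(h).
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("data length must be a power of 2")
    if group_size <= 0 or group_size & (group_size - 1) or group_size > n:
        raise ValueError("group_size must be a power of 2 no larger than len(data)")

    group_bits = group_size.bit_length() - 1
    num_groups = n // group_size
    high_bits = num_groups.bit_length() - 1

    # One shuffle pattern, computed once and reused for every group.
    shuffle = [bit_reverse(p, group_bits) for p in range(group_size)]

    out = [None] * n
    for h in range(num_groups):
        block = data[h * group_size:(h + 1) * group_size]
        # In-group shuffle (stands in for a vector shuffle instruction).
        shuffled = [block[shuffle[p]] for p in range(group_size)]
        # Strided block write (stands in for a DMA transfer):
        # only one bit reversal per group is needed here.
        start = bit_reverse(h, high_bits)
        out[start::num_groups] = shuffled
    return out


if __name__ == "__main__":
    print(grouped_bit_reversal(list(range(16)), 4))
```

In this sketch only `num_groups` group-level reversals plus one table of `group_size` entries are computed, instead of one reversed index per element, which is the kind of saving the abstract describes.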

Keywords: fast Fourier transform; bit reverse; realignment; shuffle

Li Huixiang, Zhang Huifu, Hu Yonghua, Zhang Xin, Wang Shuying


School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China

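Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China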


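Funding: Scientific Research Fund of Hunan Provincial Education Department (20B242); Scientific Research Fund of Hunan Provincial Education Department (19A169); Hunan Provincial Natural Science Foundation (2023JJ50019)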

2024

Computer Integrated Manufacturing Systems
No. 210 Research Institute of China North Industries Group Corporation


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.092
ISSN:1006-5911
Year, volume (issue): 2024, 30(7)