A Novel Interleaved Mapping Data Layout in Scratch Pad Memory
Modern computers adhere to the classical linear data layout, which enables efficient row-major access to Two-Dimensional (2D) matrices stored in row-major order. However, this layout makes column-major access inefficient because consecutive column elements are not contiguous in memory, resulting in poor spatial locality. The efficiency of column-major access is typically improved by pre-transposing the original matrix, which concentrates the cost of column-major access into a single matrix transposition operation. Nevertheless, matrix transposition introduces additional data transfer operations and requires additional memory to store the transposed matrix. To achieve equally efficient row-major and column-major access without introducing this overhead, a novel Interleaved Mapping (IM) data layout is proposed. Without altering the internal structure of the Scratch Pad Memory (SPM), this layout is implemented by adding two new components, a Cyclic Shift Unit and a Decoder Unit, at the Input and Output (I/O) interfaces of the SPM. Additionally, customized memory access instructions are developed so that programmers can take full advantage of the data layout. Experimental results show that the SPM using the IM data layout achieves a speedup of 1.4 times while incurring an additional area overhead of 1.73%.
matrix transposition; Single Instruction Multiple Data (SIMD); Scratch Pad Memory (SPM); data layout; Static Random Access Memory (SRAM)
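
The abstract does not spell out the exact IM mapping, so the following is only a minimal sketch of one common way to interleave a matrix across memory banks so that rows and columns are both conflict-free. It assumes a hypothetical SPM with `N_BANKS` single-port banks and a skewed mapping in which element (r, c) is stored in bank (r + c) mod `N_BANKS` at address r. All names (`N_BANKS`, `bank_of`, `store`, `load_row`, `load_col`) are illustrative and do not come from the paper. In this sketch, the cyclic rotation of lanes loosely plays the role of a cyclic shift stage, and the per-bank address selection in `load_col` loosely plays the role of a decoder stage.

```python
# Hypothetical sketch of a skewed (cyclically shifted) bank mapping.
# This is an assumed scheme for illustration, not the paper's exact design.

N_BANKS = 4  # assumed number of SPM banks; the matrix is N_BANKS x N_BANKS


def bank_of(row, col, n=N_BANKS):
    """Bank holding element (row, col): row r is rotated by r positions."""
    return (row + col) % n


def store(matrix, n=N_BANKS):
    """Place element (r, c) in bank (r + c) % n at address r."""
    banks = [[None] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            banks[bank_of(r, c, n)][r] = matrix[r][c]
    return banks


def load_row(banks, r, n=N_BANKS):
    """Row access: every bank is read at the same address r, then the
    lanes are rotated back into column order."""
    return [banks[bank_of(r, c, n)][r] for c in range(n)]


def load_col(banks, c, n=N_BANKS):
    """Column access: each bank is read at a different address, and the
    n elements still come from n distinct banks (no bank conflict)."""
    return [banks[bank_of(r, c, n)][r] for r in range(n)]


if __name__ == "__main__":
    m = [[r * N_BANKS + c for c in range(N_BANKS)] for r in range(N_BANKS)]
    b = store(m)
    print(load_row(b, 1))  # second row of m
    print(load_col(b, 2))  # third column of m
```

Because both a row and a column touch each bank exactly once under this assumed mapping, either can in principle be fetched in a single parallel access, with no transposed copy of the matrix held in memory.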