The existing polyhedral compilation frameworks(such as Pluto,LLVM/Poly and GCC/Graphite)use fixed block sizes when performing loop tiling,which cannot fully utilize the caching characteristics of different hardware,resulting in significant performance differences.In response to this issue,many loop tiling algorithms based on multi-level caching and data locality have emerged,but these algorithms often only optimize specific loop programs or lack comprehensive consideration,and are not suitable for transplantation into general compilers.This paper proposes a tile size selection algorithm based on data locality,which not on-ly considers the impact of cache replacement strategy,but also considers the load balancing problem in multi-core environments.The algorithm is implemented based on the Polly module in LLVM,and some test cases from Pluto and PolyBench are selected for single core and multi-core testing.The experimental results show that compared to the default partitioning method of LLVM/Polly,the proposed algorithm achieves an average acceleration ratio of 2.03 and 2.05 on two hardware platforms in a single core environment,and has good parallel scalability in a multi-core environment.
Data localityPolyhedral modelLoop tilingTile sizeLoad balancing