An Acceleration Strategy for Operator Generation Based on TVM
With the rapid development of Artificial Intelligence (AI), the continuous emergence of new operators and underlying hardware has increased the workload of developing and maintaining operator libraries. Relying solely on manual optimization to improve the performance and efficiency of AI models can create bottlenecks. The TVM deep learning compiler alleviates the burden of manual optimization through automated code generation, but it suffers from long search times. To address this issue, this study proposes two optimization strategies for Ansor, TVM's automated code generation framework. The first introduces a new cost model based on a gradient boosting algorithm; the second prunes the scheduling space according to predefined rules. Both strategies aim to accelerate TVM's automated code generation, enabling rapid deployment of models and providing more efficient solutions for applying AI technology. Experimental results show that with the optimized cost model, the tuning time of a model on the x86 CPU platform can be reduced by 30% to 35% without degrading inference time, and the performance of individual optimized operators can improve by up to 22%. On the Deep Computing Unit (DCU) platform, tuning time is reduced by approximately 20%, and the average performance of optimized operators improves by 5.7%. In addition, the rule-based pruning strategy effectively improves the convergence speed of the cost model, improving the model's inference performance by 7.4% under the original optimal number of iterations.
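The two strategies above can be illustrated with a minimal sketch. This is not Ansor's actual implementation: the features, the pruning rule, and the use of scikit-learn's gradient boosting regressor are all illustrative assumptions standing in for Ansor's learned cost model and predefined pruning rules.

```python
# Hypothetical sketch of the two strategies: a gradient-boosting cost model
# that ranks candidate schedules, and a rule-based filter that prunes the
# scheduling space before any measurement. All names and rules are
# illustrative, not Ansor's real feature set or rule set.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic schedule features: [tile_size, unroll_factor, vector_width].
features = rng.integers(1, 65, size=(200, 3)).astype(float)
# Synthetic measured latencies (lower is better) used to train the model.
latency = features[:, 0] * 0.5 + 64.0 / features[:, 2] + rng.normal(0, 1, 200)

# Strategy 1: a gradient-boosting cost model predicts latency from features,
# so only promising candidates need to be compiled and measured on hardware.
model = GradientBoostingRegressor(n_estimators=100, max_depth=3)
model.fit(features, latency)

def prune(candidates):
    """Strategy 2: rule-based pruning. Drop schedules whose tile size is
    not a multiple of the vector width (an illustrative predefined rule)."""
    return [c for c in candidates if c[0] % c[2] == 0]

candidates = [(32, 4, 8), (17, 2, 8), (64, 8, 16), (13, 1, 4)]
kept = prune(candidates)
# Rank surviving candidates by predicted latency; measure only the top ones.
scores = model.predict(np.array(kept, dtype=float))
ranked = [c for _, c in sorted(zip(scores, kept))]
print(ranked[0])  # the schedule predicted to be fastest among those kept
```

Pruning shrinks the search space before prediction, while the cost model avoids measuring every surviving candidate; together they shorten the tuning loop, which is the effect the abstract reports.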
Keywords: deep learning compiler; cost model; gradient boosting algorithm; pruning strategy; automatic tuning