Research Progress of Large Language Model Acceleration Based on Computer Architecture
Large language models are the most active research direction in artificial intelligence, but their heavy dependence on computing resources has severely limited their application across industries, especially for training and deployment on edge-side mobile devices and other small devices. Lightweight acceleration of large language models is therefore an urgent problem. From the perspective of computer architecture, this paper analyzes common methods of large language model acceleration in seven aspects: processor, input/output communication, memory, video memory (GPU memory), operating system, algorithm, and compilation. The analysis covers both software and hardware. Through this systematic analysis, the paper offers an approach to building lightweight large language models.
large language model; lightweight; computing power; precision calculation; algorithm framework