Research Progress of Large Language Model Acceleration Based on Computer Architecture

Large language models are the most active research direction in artificial intelligence, but their heavy dependence on computing resources has seriously hindered their adoption across industries, especially for training and deployment on edge-side mobile devices and small devices. Lightweight acceleration of large language models is therefore an urgent problem. From the perspective of computer architecture, this paper analyzes the methods commonly used to accelerate large language models across seven aspects: processor, input/output communication and transmission, main memory, GPU memory, operating system, algorithm, and compilation. The analysis covers the full stack, spanning both software and hardware, and offers a systematic approach to building lightweight large language models.

large language model; lightweight; computing power; precision calculation; algorithm framework

Wu Qiong (武琼)

北京冉腾语云科技有限公司, Beijing 100000, China

2024

科学与信息化

Year, volume (issue): 2024, (1)