
Review of Evolution of Large Language Model Algorithms

Large language models based on the Transformer architecture show powerful capabilities and mark a major step toward artificial general intelligence (AGI). The evolution of large language model architectures and algorithms follows two technical routes: improving inference efficiency and improving model capability. This paper describes the mainstream technical solutions and ideas for both routes. Methods for improving inference efficiency include distributed inference, computation optimization, memory access optimization, and quantization. Model capability is improved mainly by introducing new architectures, such as mixture of experts (MoE) models and state space models (SSM).

large language model; Transformer; attention

朱炫鹏、姚海东、刘隽、熊先奎


ZTE Corporation, Shenzhen 518057, China


2024

ZTE Technology Journal
ZTE Corporation; Anhui Institute of Scientific and Technical Information


CSTPCD; Peking University Core Journals
Impact factor: 1.272
ISSN: 1009-6868
Year, volume (issue): 2024, 30(2)