Analysis of large language model architecture evolution

This paper systematically reviews and analyzes the major directions of innovation built on the Transformer architecture. It examines the evolution of large language model algorithms along three dimensions: innovation within the Transformer architecture itself, fusion of the Transformer with other architectures, and non-Transformer algorithmic innovation. It closes with an outlook on the future development of large models.

large model architecture; Transformer; attention mechanism; architectural innovation

WANG Yuntao (王蕴韬)

Artificial Intelligence Research Institute, China Academy of Information and Communications Technology, Beijing 100191, China

2024

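Information and Communications Technology and Policy
Research Institute of Telecommunications Transmission, Ministry of Information Industry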

Impact factor: 0.363
ISSN: 2096-5931
Year, volume (issue): 2024, 50(12)