Analysis of large language model architecture evolution
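Wang Yuntao (王蕴韬) 1
Author information
- 1. Artificial Intelligence Research Institute, China Academy of Information and Communications Technology, Beijing 100191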
Abstract
This paper systematically reviews the major directions of innovation built on the Transformer architecture. It analyzes the evolution of large language model algorithms along three dimensions: innovation within the Transformer architecture itself, fusion of the Transformer with other architectures, and non-Transformer algorithmic innovation. It closes with an outlook on the future development of foundation models.
Key words
large model architecture/Transformer/attention mechanism/architectural innovation
Publication year
2024