Information and Communications Technology and Policy (信息通信技术与政策), 2024, Vol. 50, Issue 12: 13-20. DOI: 10.12267/j.issn.2096-5931.2024.12.003

Analysis of large language model architecture evolution

Wang Yuntao (王蕴韬)

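Author information

  • 1. Artificial Intelligence Research Institute, China Academy of Information and Communications Technology, Beijing 100191, China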

Abstract

This paper systematically reviews the major directions of innovation based on the Transformer architecture. It examines the evolution of large language model architectures along three dimensions: innovation within the Transformer architecture itself, hybrid innovation that fuses the Transformer with other architectures, and innovation in non-Transformer architectures. It also offers an outlook on the future development of foundation models.

Key words

large model architecture / Transformer / attention mechanism / architectural innovation

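Year of publication

2024
Information and Communications Technology and Policy (信息通信技术与政策)
Research Institute of Telecommunications Transmission, Ministry of Information Industry (信息产业部电信传输研究所)

Impact factor: 0.363
ISSN: 2096-5931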