China Instrumentation (中国仪器仪表), 2025, Issue (1): 17-21.

Feature Specifications and Pathway Development for the Predominant Architecture of a Large-scale Language Model

Fu Wenjun¹, Mao Xiongfei², Xu Xiao³

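Author information

  • 1. Zhejiang Mobile Information System Integration Co., Ltd., Hangzhou, Zhejiang 310006
  • 2. China Mobile Communications Group Zhejiang Co., Ltd., Hangzhou, Zhejiang 310016
  • 3. China Mobile Communications Group Zhejiang Co., Ltd., Shaoxing Branch, Shaoxing, Zhejiang 321000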

Abstract

As the architectures of large AI models have evolved, large language models have moved from traditional neural network architectures to Transformer-based frameworks. Recently, Transformer-based decoder-only architectures have made significant progress in parameter scale, performance, and applicability, and have gradually become the main direction for the development and study of large language models. Although decoder-only models hold a relative advantage, encoder-decoder models remain competitive on specific tasks as large models continue to be developed and optimized. It therefore remains necessary to build a safe development strategy that consolidates the practical foundation for the high-quality development of large language models.

Keywords

Large language models / Mainstream architecture / Decoder-only

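Publication year

2025
China Instrumentation (中国仪器仪表)
Instrumentation Technology and Economy Institute (机械工业仪器仪表综合技术经济研究所); China Instrument Manufacturers Association (中国仪器仪表行业协会)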

Impact factor: 0.158
ISSN: 1005-2852