Characteristic Requirements and Path Construction of Mainstream Architectures for Large Language Models

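In the evolution of large AI model architectures, large language models have moved from traditional neural network architectures to Transformer-based frameworks. Recently, Transformer-based decoder-only architectures have made notable progress in parameter scale, performance, and applicability, and have become the main direction of large language model development and research. Although decoder-only models hold a relative advantage, encoder-decoder models remain competitive on specific tasks as large models continue to be developed and optimized, so a safe development strategy is still needed to consolidate the practical foundation for the high-quality development of large language models.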
Feature Specifications and Pathway Development for the Predominant Architecture of a Large-scale Language Model
During the evolution of architectural design, large-scale language models have transitioned from traditional neural network structures to Transformer-based frameworks. Recently, Decoder-Only architectures based on Transformers have demonstrated significant advancements in parameter scale, performance, and versatility, emerging as a primary focus for the development and research of large-scale language models. Despite the comparative advantages of Decoder-Only models, Encoder-Decoder architectures remain competitive in task-specific performance as large models continue to be developed and optimized. Therefore, it is essential to establish a robust development strategy to solidify the practical foundation for high-quality advancement of large language models.

Large Language Models; Mainstream Architecture; Decoder-Only

Fu Wenjun (傅文军), Mao Xiongfei (毛雄飞), Xu Xiao (徐晓)

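Zhejiang Mobile Information System Integration Co., Ltd., Hangzhou, Zhejiang 310006

China Mobile Communications Group Zhejiang Co., Ltd., Hangzhou, Zhejiang 310016

China Mobile Communications Group Zhejiang Co., Ltd., Shaoxing Branch, Shaoxing, Zhejiang 321000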

Large language models; mainstream architecture; decoder-only

2025

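China Instrumentation (中国仪器仪表)
Instrumentation Technology and Economy Institute (机械工业仪器仪表综合技术经济研究所); China Instrument Manufacturers Association (中国仪器仪表行业协会)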

Impact factor: 0.158
ISSN: 1005-2852
Year, volume (issue): 2025, (1)