Feature Requirements and Path Construction for the Mainstream Architectures of Large Language Models

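Abstract (translated from Chinese): As the architectures of large AI models have evolved, large language models have moved from traditional neural network architectures to Transformer-based frameworks. Recently, Transformer-based decoder-only architectures have made notable progress in parameter scale, performance, and applicability, and have become the main direction of large language model development and research. Although decoder-only models hold a relative advantage, encoder-decoder models remain competitive on specific tasks as large models continue to be developed and optimized. A secure development strategy is therefore still needed to consolidate the practical foundation for the high-quality development of large language models.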

English title: Feature Specifications and Pathway Development for the Predominant Architecture of a Large-scale Language Model

English abstract: During the evolution of architectural design, large-scale language models have transitioned from traditional neural network structures to Transformer-based frameworks. Recently, Decoder-Only architectures based on Transformers have demonstrated significant advancements in parameter scale, performance, and versatility, emerging as a primary focus for the development and research of large-scale language models. Despite the comparative advantages of Decoder-Only models, Encoder-Decoder architectures remain competitive in task-specific performance as large models continue to be developed and optimized. Therefore, it is essential to establish a robust development strategy to solidify the practical foundation for high-quality advancement of large language models.

English keywords:

Large Language Models; Mainstream architecture; Decoder-Only

Authors:

傅文军、毛雄飞、徐晓

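Author affiliations:

Zhejiang Mobile Information System Integration Co., Ltd., Hangzhou, Zhejiang 310006

China Mobile Communications Group Zhejiang Co., Ltd., Hangzhou, Zhejiang 310016

China Mobile Communications Group Zhejiang Co., Ltd., Shaoxing Branch, Shaoxing, Zhejiang 321000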

Keywords:

large language models; mainstream architecture; decoder-only

Publication year:

2025

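China Instrumentation (中国仪器仪表)

Instrumentation Technology and Economy Institute (机械工业仪器仪表综合技术经济研究所); China Instrument Manufacturers Association (中国仪器仪表行业协会)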

Impact factor: 0.158

ISSN: 1005-2852

Year, volume (issue): 2025, (1)