Key Technologies and Applications of Large Models

Han Bingtao¹, Liu Tao¹

Author information

1. ZTE Corporation, Shenzhen 518057, China; State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518055, China

Abstract

This paper reviews the major advances in key technologies and applications of large models since the release of ChatGPT. In model design, model scale continues to grow, although the growth is slowing; longer context windows and multimodality have become mainstream, and computational efficiency has improved markedly. In model training, the focus has shifted from simply pursuing data quantity to attending to data diversity and quality; in particular, how to train large models with synthetic data has become a mainstream research direction, one that is key to moving toward artificial general intelligence (AGI). In model inference, model quantization and inference engine optimization have greatly reduced the cost of using models, and emerging algorithms such as speculative sampling are gradually maturing. At the application level, Agent technology has made significant progress and plays an irreplaceable role in overcoming the inherent limitations of large models. More and more enterprises are beginning to plan, develop, and use large models; enterprise-level large model application architectures are maturing, and enterprises are using three elements (scenarios, technologies, and algorithms) as levers to accelerate closing the loop on the commercial value of large models.

Key words

large model/model training/inference acceleration/large model safety/Agent

Publication year

2024

ZTE Technology Journal

ZTE Corporation; Anhui Institute of Scientific and Technical Information

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 1.272

ISSN: 1009-6868

Number of references: 47
