Research on the development of intelligent computing network for large models

Original title: 面向大模型的智算网络发展研究

Abstract: In recent years, the world has entered a period of vigorous development in intelligent computing. Large models are deep learning models with huge numbers of parameters and complex structures, and training them requires fast synchronization of training parameters across multiple cards and servers. This places higher requirements on the bandwidth, latency, reliability, scalability, and security of data center networks. The paper studies the requirements and key technologies of intelligent computing networks for large model training, and analyzes the research results, standard specifications, and case practices of such networks, with the aim of further promoting their development.

Keywords:

large model; intelligent computing center; network technology

Authors:

Guo Liang (郭亮), Wang Shaopeng (王少鹏), Quan Wei (权伟), Li Jie (李洁)

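Author affiliations:

Cloud Computing and Big Data Research Institute, China Academy of Information and Communications Technology, Beijing 100191

School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044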

Keywords (original Chinese):

大模型; 智算中心; 网络技术

Funding:

National Science and Technology Major Project on New Generation Artificial Intelligence (新一代人工智能国家科技重大项目)

Project number:

2021ZD0113003

Publication year:

2024

DOI:

10.11959/j.issn.1000-0801.2024147

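Journal: Telecommunications Science (电信科学)

Sponsor and publisher: China Institute of Communications; Posts & Telecom Press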

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.902

ISSN: 1000-0801

Year, volume (issue): 2024, 40(6)

Number of references: 4