Aero Weaponry, 2024, Vol. 31, Issue 1: 23-31. DOI: 10.12132/ISSN.1673-5048.2023.0138

Research on the Development Status of High Speed Interconnection Technologies and Topologies of Multi-GPU Systems

Cui Chen, Wu Di, Tao Yerong, Zhao Yanli

Author information

  • 1. Unit 63891 of the Chinese People's Liberation Army, Luoyang, Henan 471003, China

Abstract

Multi-GPU systems improve performance by scaling out, in order to meet the ever-increasing computational demand driven by increasingly complex algorithms and continuously growing data volumes in artificial intelligence. The interconnection bandwidth between processors and the system topology are the key factors that determine the performance of a multi-GPU system. In traditional PCIe-based multi-GPU systems, PCIe bandwidth is the bottleneck that limits system performance. GPU-oriented high-speed interconnection technologies have become an effective way to address this bandwidth limitation. This article first introduces the PCIe interconnection technology and the typical topologies used in traditional multi-GPU systems. Then, taking Nvidia NVLink, AMD Infinity Fabric Link, Intel Xe Link, and Biren Technology BLink as examples, it reviews and analyzes the GPU-oriented high-speed interconnection technologies and topologies of representative GPU vendors in China and abroad. Finally, it discusses the research implications for interconnection technologies.

Key words

multi-GPU system/high-speed interconnection technology/topology/interconnection bandwidth/data center


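Publication year

2024
Aero Weaponry
China Airborne Missile Academy

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.453
ISSN: 1673-5048
Citations: 1
References: 35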