Computer Engineering & Science, 2024, Vol. 46, Issue 2: 209-216. DOI: 10.3969/j.issn.1007-130X.2024.02.003

Analysis and evaluation of congestion control in interconnection networks for high performance computing

Sun Yan, Zhang Jianmin, Li Yuan, Sun Shunyu

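Author information

  • 1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China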

Abstract

With the development of high performance computing technology, the number of network nodes in high performance computing systems keeps growing, and high performance computing applications place increasingly stringent requirements on network performance. As a result, congestion control for high performance interconnection networks faces great pressure and challenges. Researching efficient, low-overhead congestion control methods tailored to the characteristics of high performance computing interconnection networks is key to ensuring the performance and stability of these networks. This study addresses the core issues of interconnection communication in high performance computing systems by analyzing and experimentally comparing mainstream congestion control methods. Based on the structural characteristics and communication properties of high performance computing systems, it designs a data flow model and a flow file generation tool for large-scale simulation, and proposes a comprehensive evaluation index for congestion control. Using the proposed data flow model, different congestion control methods are simulated on a relatively large-scale network, and their performance is analyzed and evaluated with the proposed evaluation index. The proposed analysis and evaluation techniques enable a more objective and accurate assessment of congestion control methods for high performance interconnection networks.

Key words

high performance computing/congestion control/traffic control/RDMA network

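Funding

National Key Research and Development Program of China (2022YFB2803405)

National Defense Science and Technology Key Laboratory Project (WDZC20235250114)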

Publication year

2024
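Computer Engineering & Science
College of Computer Science and Technology, National University of Defense Technology

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Number of references: 12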