
Courier: A Unified Communication Agent to Support Concurrent Flow Scheduling in Cluster Computing

As one of the pillars of cluster computing frameworks, coflow scheduling algorithms can effectively shorten the network transmission time of cluster computing jobs, thereby reducing job completion times and improving execution performance. However, most existing coflow scheduling algorithms fail to consider the influence of concurrent flows, so their performance can degrade when a massive number of flows are active at once. To fill this gap, we propose Courier, a unified communication agent that minimizes the number of concurrent flows in cluster computing applications and is compatible with mainstream coflow scheduling approaches. To maintain the scheduling order given by the scheduling algorithm, Courier merges the multiple flows between each pair of hosts into a unified flow and determines its order from the orders of the original flows. In addition, to adapt to various types of topologies, Courier introduces a control mechanism that adjusts the number of flows while maintaining the scheduling order. Extensive large-scale trace-driven simulations show that Courier is compatible with existing scheduling algorithms and outperforms state-of-the-art approaches by about 30% under a variety of workloads and topologies.
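
The per-host-pair merging described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the unified flow's order is derived from the original flows, so this sketch assumes the unified flow inherits the earliest (smallest) order among its members. The `Flow` type, its fields, and `merge_flows` are hypothetical names introduced for illustration only, not Courier's actual interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are assumptions."""
    src: str    # sending host
    dst: str    # receiving host
    size: int   # bytes to transfer
    order: int  # rank assigned by the coflow scheduler (smaller = earlier)


def merge_flows(flows):
    """Merge all flows sharing a (src, dst) pair into one unified flow.

    Assumption: the unified flow takes the earliest order among its
    members, and members keep their scheduler-given order inside it.
    """
    groups = defaultdict(list)
    for flow in flows:
        groups[(flow.src, flow.dst)].append(flow)

    merged = []
    for (src, dst), members in groups.items():
        # Preserve the scheduler's order among the original flows.
        members.sort(key=lambda f: f.order)
        merged.append({
            "src": src,
            "dst": dst,
            "size": sum(f.size for f in members),
            "order": members[0].order,
            "members": members,
        })

    # Unified flows are served in the order inherited from their members.
    merged.sort(key=lambda m: m["order"])
    return merged
```

With three input flows, two of them between the same pair of hosts, `merge_flows` returns two unified flows, so the number of concurrent flows drops from three to two while the relative order of the original flows is kept.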

Scheduling algorithms, Cluster computing, Data centers, Topology, Servers, Switches, Scheduling, Training, Network topology, Packet loss

Zhaochen Zhang, Xu Zhang, Zhaoxiang Bao, Liang Wei, Chaohong Tan, Wanchun Dou, Guihai Chen, Chen Tian


State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

School of Electronic Science and Engineering, Nanjing University, Nanjing, China

Jiangsu Future Network Innovation Institute, Zhenjiang, China

2025

IEEE Transactions on Parallel and Distributed Systems


SCI
Year, Volume (Issue): 2025, 36(5)