
GNNSched: A GNN inference task scheduling framework on GPU

Because of frequent memory accesses, graph neural networks (GNNs) often achieve low resource utilization when running on GPUs. Existing inference frameworks do not account for the irregularity of GNN inputs, so when they are applied directly to co-locate GNN inference tasks, the tasks may exceed GPU memory capacity and fail. For GNN inference tasks, the memory footprint of concurrent tasks must therefore be analyzed in advance, based on their input characteristics, to ensure that the tasks can be co-located on the GPU successfully. In addition, inference tasks submitted in multi-tenant scenarios need flexible scheduling strategies to meet the quality-of-service (QoS) requirements of concurrent inference tasks. To address these problems, this paper proposes GNNSched, which efficiently manages the co-located execution of GNN inference tasks on GPU. Specifically, GNNSched organizes concurrent inference tasks into a queue and uses cost functions to estimate the memory footprint of each task at operator granularity. GNNSched implements multiple scheduling strategies to generate task groups, which are iteratively submitted to the GPU for concurrent execution. Experimental results show that GNNSched can meet the QoS requirements of concurrent GNN inference tasks and reduce their response latency.
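The abstract does not give GNNSched's actual cost functions or scheduling strategies, so the following is only a minimal sketch of the general idea under stated assumptions: each operator has a cost function that maps the input graph size to a byte estimate, a task's footprint is the sum over its operators, and a simple first-in-first-out policy packs queued tasks into groups that fit within GPU memory. The names `Task`, `estimate_memory`, `form_groups`, the two operator kinds, and their formulas are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

BYTES_PER_FLOAT = 4

# Hypothetical per-operator cost functions (assumptions, not GNNSched's own):
# each returns the bytes of output an operator writes for a given input graph.
def dense_cost(num_nodes: int, num_edges: int, out_dim: int) -> int:
    # A node-wise linear layer writes one out_dim vector per node.
    return num_nodes * out_dim * BYTES_PER_FLOAT

def message_cost(num_nodes: int, num_edges: int, out_dim: int) -> int:
    # Materialized edge messages write one out_dim vector per edge.
    return num_edges * out_dim * BYTES_PER_FLOAT

COST_FUNCTIONS: Dict[str, Callable[[int, int, int], int]] = {
    "dense": dense_cost,
    "message": message_cost,
}

@dataclass
class Task:
    name: str
    num_nodes: int
    num_edges: int
    operators: List[Tuple[str, int]]  # (operator kind, output feature width)

def estimate_memory(task: Task) -> int:
    """Sum per-operator estimates; assumes no buffer reuse between operators."""
    return sum(
        COST_FUNCTIONS[kind](task.num_nodes, task.num_edges, dim)
        for kind, dim in task.operators
    )

def form_groups(queue: List[Task], capacity: int) -> List[List[Task]]:
    """Greedy FIFO grouping: start a new group when the next task would not fit."""
    groups: List[List[Task]] = []
    current: List[Task] = []
    used = 0
    for task in queue:
        need = estimate_memory(task)
        if need > capacity:
            raise ValueError(f"task {task.name} cannot fit on the GPU alone")
        if current and used + need > capacity:
            groups.append(current)
            current, used = [], 0
        current.append(task)
        used += need
    if current:
        groups.append(current)
    return groups

# Example queue with irregular inputs: similar models, very different graphs.
queue = [
    Task("a", num_nodes=1000, num_edges=5000, operators=[("dense", 64), ("message", 64)]),
    Task("b", num_nodes=2000, num_edges=4000, operators=[("dense", 32)]),
    Task("c", num_nodes=500, num_edges=20000, operators=[("message", 16)]),
]
groups = form_groups(queue, capacity=2_000_000)
group_names = [[t.name for t in g] for g in groups]
```

With a 2,000,000-byte budget, tasks `a` and `b` share the first group and `c` is deferred to the second, since adding it would exceed the budget. A QoS-aware strategy could reorder the queue (for example, by deadline) before grouping; FIFO is used here only because it is the simplest policy to illustrate.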

Keywords: graph neural network (GNN); graphics processing unit (GPU); inference framework; task scheduling; estimation model

Sun Qingxiao, Liu Yi, Yang Hailong, Wang Yiqing, Jia Jie, Luan Zhongzhi, Qian Depei


School of Computer Science and Engineering, Beihang University, Beijing 100191


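Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0117805); National Natural Science Foundation of China (62072018, 62322201, U22A2028); Fundamental Research Funds for the Central Universities (YWF-23-L-1121)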

2024

Computer Engineering & Science
College of Computer, National University of Defense Technology


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Year, volume (issue): 2024, 46(1)