Computer Engineering & Science, 2024, Vol. 46, Issue 1: 1-11. DOI: 10.3969/j.issn.1007-130X.2024.01.001

GNNSched: A GNN Inference Task Scheduling Framework on GPU

SUN Qingxiao, LIU Yi, YANG Hailong, WANG Yiqing, JIA Jie, LUAN Zhongzhi, QIAN Depei

Author information

  • 1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China


Abstract

Due to frequent memory accesses, graph neural networks (GNNs) often exhibit low resource utilization when running on GPUs. Existing inference frameworks, which do not consider the irregularity of GNN inputs, may exceed GPU memory capacity when directly applied to co-locate GNN inference tasks, causing task failures. For GNN inference tasks, it is necessary to pre-analyze the memory occupation of concurrent tasks based on their input characteristics to ensure successful co-location on the GPU. In addition, inference tasks submitted in multi-tenant scenarios urgently need flexible scheduling strategies to meet the quality-of-service requirements of concurrent inference tasks. To solve these problems, this paper proposes GNNSched, which efficiently manages the co-location of GNN inference tasks on GPU. Specifically, GNNSched organizes concurrent inference tasks into a queue and estimates the memory occupation of each task at the operator level using a cost function. GNNSched implements multiple scheduling strategies to generate task groups, which are iteratively submitted to the GPU for concurrent execution. Experimental results show that GNNSched can meet the quality-of-service requirements of concurrent GNN inference tasks and reduce their response time.
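As the abstract describes, GNNSched estimates each task's memory footprint at operator granularity and packs tasks into groups whose combined footprint fits within GPU memory before submitting each group for concurrent execution. The following is a minimal sketch of that idea; the class names, the simplified cost function (output-tensor sizes of aggregation and update operators), and the first-fit grouping policy are illustrative assumptions, not GNNSched's actual implementation.

```python
from dataclasses import dataclass
from typing import List

BYTES_PER_FLOAT = 4

@dataclass
class GNNTask:
    name: str
    num_nodes: int         # nodes in the input graph
    num_edges: int         # non-zeros of the adjacency matrix
    layer_dims: List[int]  # e.g. [in_feats, hidden, out_feats]

def estimate_memory(task: GNNTask) -> int:
    """Toy operator-level cost function: sum the sizes of the input
    features, each aggregation (SpMM) output, each update (GEMM)
    output, and the layer weights. A real cost function would also
    account for edge-dependent intermediates (hence num_edges)."""
    total = task.num_nodes * task.layer_dims[0] * BYTES_PER_FLOAT
    for in_dim, out_dim in zip(task.layer_dims, task.layer_dims[1:]):
        total += task.num_nodes * in_dim * BYTES_PER_FLOAT   # aggregation output
        total += task.num_nodes * out_dim * BYTES_PER_FLOAT  # update output
        total += in_dim * out_dim * BYTES_PER_FLOAT          # layer weights
    return total

def schedule_groups(queue: List[GNNTask], capacity: int) -> List[List[GNNTask]]:
    """First-fit grouping: walk the task queue in order and pack tasks
    into the current group while the summed memory estimate stays within
    GPU capacity; each closed group is submitted for concurrent execution."""
    groups: List[List[GNNTask]] = []
    current: List[GNNTask] = []
    used = 0
    for task in queue:
        need = estimate_memory(task)
        if current and used + need > capacity:
            groups.append(current)
            current, used = [], 0
        current.append(task)
        used += need
    if current:
        groups.append(current)
    return groups
```

With a 1 MB toy capacity, two tasks of ~0.42 MB each share a group while a third spills into the next, mirroring how co-location is bounded by the memory estimate rather than by a fixed task count.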


Key words

graph neural network (GNN); graphics processing unit (GPU); inference framework; task scheduling; estimation model


Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0117805)

National Natural Science Foundation of China (62072018, 62322201, U22A2028)

Fundamental Research Funds for the Central Universities (YWF-23-L-1121)

Publication year

2024

Journal: Computer Engineering & Science (ISSN 1007-130X), College of Computer, National University of Defense Technology. Indexed by CSTPCD and the Peking University Core Journal list. Impact factor: 0.787. References: 31.