Design of a Neural Network Hardware Accelerator Based on Graph Convolution
Chang Jingtao (常静涛) 1, Wang Renping (王仁平) 1
Abstract
Many applications today represent and process their data as graphs. Graph data is irregular and lies in non-Euclidean space, and the graph convolutional neural network (GCN) emerged to meet the need to process it. A GCN has three main processing steps: aggregation, transformation, and activation. In this paper, we adopt a heterogeneous design to accelerate GCN inference. Based on the characteristics of the data, the accelerator uses a systolic array in the transformation stage to improve the data flow. In the aggregation stage, the workload is divided into two types, which mitigates load imbalance during aggregation and shortens the computation time to some extent. Finally, a performance evaluation on the Xilinx Virtex UltraScale+ VU37P HBM FPGA platform shows that this work achieves average speedups of 389.19× and 6.73× over the CPU and GPU, respectively.
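The three GCN steps named in the abstract can be illustrated with a minimal software sketch. It is not the accelerator's implementation: the function names, the transform-before-aggregate ordering, the unnormalized neighbour sum, and the degree threshold used to split the aggregation workload into two classes are all assumptions made for illustration, since the abstract does not state how the two workload types are defined.

```python
def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One simplified GCN layer: transformation, aggregation, ReLU.

    adj[i] lists the neighbours of node i (self-loops included).
    Degree normalization is omitted to keep the sketch short.
    """
    # Transformation: dense product X * W (the regular, systolic-friendly part).
    transformed = matmul(features, weights)
    # Aggregation: each node sums the transformed rows of its neighbours
    # (the irregular, sparse part whose cost depends on node degree).
    aggregated = []
    for neighbours in adj:
        acc = [0.0] * len(weights[0])
        for j in neighbours:
            for k, value in enumerate(transformed[j]):
                acc[k] += value
        aggregated.append(acc)
    # Activation: element-wise ReLU.
    return [[max(0.0, v) for v in row] for row in aggregated]


def split_by_degree(adj, threshold):
    """Hypothetical two-way workload split: nodes above/below a degree threshold."""
    heavy = [i for i, n in enumerate(adj) if len(n) > threshold]
    light = [i for i, n in enumerate(adj) if len(n) <= threshold]
    return heavy, light


adj = [[0, 1, 2], [0, 1], [0, 2]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, -1.0], [0.5, -2.0]]
output = gcn_layer(adj, features, weights)
heavy, light = split_by_degree(adj, threshold=2)
```

In this toy graph, node 0 has three neighbours and the other two nodes have two each, so a threshold of 2 places node 0 in the heavy class. Handling the two classes separately is one plausible way to keep processing elements evenly loaded during aggregation.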
Key words
graph convolution/machine learning/hardware acceleration
Publication year
2024