OpBench: an operator-level GPU benchmark for deep learning

Qingwen GU¹, Bo FAN², Zhengning LIU³, Kaicheng CAO², Songhai ZHANG¹, Shimin HU¹

Author information

1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2. National Innovation Institute of Defense Technology, Academy of Military Science, Beijing 100071, China
3. Beijing Fitten Technology Co., Ltd., Beijing 100080, China

Abstract

Operators (such as Conv and ReLU) play an important role in deep neural networks. Every neural network is composed of a series of differentiable operators. However, existing AI benchmarks mainly focus on assessing the training and inference performance of deep learning systems on specific models. To help identify computing bottlenecks in GPU hardware and intuitively evaluate GPU performance on specific deep learning tasks, this paper focuses on evaluating GPU performance at the operator level. We statistically analyze operator information from 12 representative deep learning models across six prominent AI tasks and provide an operator dataset that shows how the importance of different operator types varies across networks. An operator-level benchmark, OpBench, is proposed on the basis of this dataset, allowing users to choose from a given range of models and set the input sizes according to their demands. This benchmark offers a detailed operator-level performance report for AI and hardware developers. We also evaluate four GPU models on OpBench and find that their performance differs across operator types and is not fully consistent with the performance metric FLOPS (floating point operations per second).
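
To illustrate what an operator-level measurement might involve, the sketch below counts the floating point operations of a single convolution and converts a measured run time into achieved FLOPS. It is not taken from OpBench: the function names, the convention of counting one multiply-add as two operations, and the wall-clock timer are all assumptions made for this example.

```python
import time
from statistics import median


def conv2d_flops(batch, c_in, c_out, h_out, w_out, k_h, k_w):
    """Floating point operations of a dense 2D convolution without bias.

    Assumed convention: each multiply-add counts as 2 operations.
    """
    return 2 * batch * c_out * h_out * w_out * c_in * k_h * k_w


def time_operator(op, warmup=3, repeats=10):
    """Median wall-clock seconds of one call to op().

    Hypothetical helper. On a GPU, op() would also have to synchronize
    the device before returning, otherwise only the launch is timed.
    """
    for _ in range(warmup):
        op()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    return median(samples)


def achieved_flops(flops, seconds):
    """Achieved throughput in floating point operations per second."""
    return flops / seconds


if __name__ == "__main__":
    # A 7x7 convolution, 3 -> 64 channels, 112x112 output, batch size 1.
    work = conv2d_flops(1, 3, 64, 112, 112, 7, 7)
    # Stand-in workload; a real run would call the operator under test.
    seconds = time_operator(lambda: sum(range(100_000)))
    print(f"{work} FLOPs per call, median time {seconds:.6f} s")
```

Comparing the achieved figure for each operator type against a device's peak FLOPS is one way to see the kind of mismatch the abstract reports.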

Key words

deep learning/operator benchmark/model benchmark/GPU performance analysis/deep neural networks

Publication year

2024

Science China Information Sciences

Chinese Academy of Sciences

CSTPCD, EI

Impact factor: 0.715

ISSN: 1674-733X
