Convolutional neural network adaptation and optimization method in SIMT computing mode

To study and optimize the performance of neural network applications in general-purpose computing on graphics processing units (GPGPU) based on single instruction multiple threads (SIMT) processors, this work contributes a self-developed SIMT processor named Pomelo and a corresponding assembly program. The parallel mechanism of the SIMT computing mode and the self-developed Pomelo processor are briefly introduced. A common convolutional neural network (CNN) is built to verify the compatibility and functionality of the Pomelo processor. A CNN computing flow with task-level and hardware-level optimization is adopted on the Pomelo processor. A specific algorithm for organizing a Z-shaped memory structure is developed, which aims to reduce memory access in mass-data computing tasks. With the combined adaptation and optimization strategy above, the experimental results demonstrate that reducing memory access in the SIMT computing mode plays a crucial role in improving performance. A speedup of 6.52 times is achieved in the case of 4 processing elements.
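
The abstract does not spell out how the Z-shaped memory structure is organized. As a rough illustration only, the sketch below assumes it resembles the well-known Z-order (Morton) layout, in which the bits of the column and row indices are interleaved so that every 2x2 tile of a matrix occupies consecutive addresses. The function names `interleave_bits` and `to_z_order` are hypothetical and are not taken from the paper or from the Pomelo toolchain.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Morton code: bit i of x goes to bit 2i, bit i of y goes to bit 2i+1."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def to_z_order(matrix):
    """Flatten a square matrix whose side is a power of two into Z-order.

    Assumption: this is a generic Morton layout, not the paper's algorithm.
    """
    n = len(matrix)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in matrix):
        raise ValueError("expected a square matrix with power-of-two side")
    flat = [None] * (n * n)
    for y, row in enumerate(matrix):
        for x, value in enumerate(row):
            flat[interleave_bits(x, y)] = value
    return flat


if __name__ == "__main__":
    # 4x4 matrix stored row-major: element (row y, column x) holds 4*y + x.
    m = [[4 * y + x for x in range(4)] for y in range(4)]
    print(to_z_order(m))
```

Under this assumed layout, the four values of any aligned 2x2 tile sit next to each other in memory, so a kernel window or pooling window that covers the tile could in principle be fetched with fewer separate memory accesses than in a row-major layout.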

parallel computing; single instruction multiple threads (SIMT); convolutional neural network (CNN); memory optimization

Feng Zhenfu, Zhang Yaying, Yang Lele, Xing Lidong

School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China

Funding: Scientific Research Program of the Shaanxi Provincial Department of Education (20JY058)

2024

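The Journal of China Universities of Posts and Telecommunications (English edition)
Beijing University of Posts and Telecommunications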

Impact factor: 0.419
ISSN: 1005-8885
Year, volume (issue): 2024, 31(2)
  • 9