电子测量技术 (Electronic Measurement Technology), 2024, Vol. 47, Issue 5: 1-8. DOI: 10.19651/j.cnki.emt.2415345

Accelerator design of sparse convolutional neural network based on FPGA

Li Ning¹, Xiao Hao¹

Author Information

  • 1. School of Microelectronics, Hefei University of Technology, Hefei 230601, China

Abstract

Pruning is an effective way to reduce the weights and computation of a convolutional neural network (CNN), offering a path to efficient CNN deployment. However, the irregular distribution of weights in a pruned sparse CNN leads to unequal workloads among the hardware computing units, which lowers the computing efficiency of the hardware. This paper proposes a fine-grained CNN pruning method that divides the overall weights into several local weight groups according to the architecture of the hardware accelerator and prunes each group independently, so that the resulting sparse CNN achieves balanced workloads on the accelerator. In addition, a sparse CNN accelerator with an efficient PE structure and configurable sparsity is designed and implemented on an FPGA: the efficient PE structure improves multiplier throughput, while the configurability allows the accelerator to flexibly handle CNNs of different sparsity. Experimental results show that the proposed pruning algorithm reduces CNN weight parameters by 50%~70% with an accuracy loss of less than 3%. Compared with a dense accelerator, the proposed accelerator achieves a speedup of up to 3.65x; compared with other sparse accelerators, it improves hardware efficiency by 28%~167%.
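The fine-grained pruning idea described above — splitting the overall weights into local groups and pruning every group to the same sparsity so that hardware workloads stay balanced — can be sketched as follows. This is an illustrative sketch, not the paper's implementation: `balanced_group_prune` is a hypothetical name, and the grouping here is a simple fixed-size split of a flat weight list, whereas the paper groups weights according to the accelerator's architecture.

```python
def balanced_group_prune(weights, group_size, sparsity):
    """Magnitude-prune each local weight group independently so that every
    group retains the same number of nonzero weights, balancing the
    computation load across the accelerator's PEs.

    weights    : flat list of weight values
    group_size : number of weights per local group (hypothetical mapping
                 to one hardware computing unit)
    sparsity   : target fraction of weights to zero out in each group
    """
    keep = max(1, round(group_size * (1 - sparsity)))  # nonzeros kept per group
    pruned = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # indices of the `keep` largest-magnitude weights in this group
        top = sorted(range(len(group)), key=lambda i: abs(group[i]),
                     reverse=True)[:keep]
        keep_idx = set(top)
        pruned.extend(w if i in keep_idx else 0.0
                      for i, w in enumerate(group))
    return pruned

# With group_size=4 and sparsity=0.5, each group of 4 keeps its 2
# largest-magnitude weights, so every group has exactly 2 nonzeros:
w = [0.9, -0.1, 0.05, 0.7, -0.8, 0.02, 0.3, -0.6]
balanced_group_prune(w, group_size=4, sparsity=0.5)
# → [0.9, 0.0, 0.0, 0.7, -0.8, 0.0, 0.0, -0.6]
```

Because every group ends up with an identical nonzero count, each hardware computing unit processes the same number of multiplications, which is the load-balancing property the paper attributes to its grouping scheme.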

Key words

convolutional neural network / hardware accelerator / sparse computing / FPGA


Funding

National Natural Science Foundation of China (61974039)

Publication Year

2024

Journal

电子测量技术 (Electronic Measurement Technology)
Publisher: 北京无线电技术研究所
Indexed: CSTPCD, 北大核心 (Peking University Core Journals)
Impact factor: 1.166
ISSN: 1002-7300
References: 8