A neural network pruning and quantization algorithm for hardware deployment

Deep neural networks perform well and are widely used in image recognition, object detection, and related fields. However, their large parameter counts and heavy computational load make them difficult to deploy on mobile edge devices that require low latency and low power consumption. To address this problem, a compression algorithm is proposed that replaces multiplication with shift and add operations: the network is pruned and quantized so that its parameters are compressed to low bit widths. When multiplier resources are limited, the algorithm reduces the difficulty of hardware deployment, meets the low-latency and low-power requirements of mobile edge devices, and improves operating efficiency. Experiments on classical neural networks with the ImageNet dataset show that, with parameters compressed to 4 bits, accuracy is essentially the same as that of the full-precision networks. On ResNet18, ResNet50, and GoogLeNet, Top-1/Top-5 accuracy even improves by 0.38%/0.22%, 0.35%/0.21%, and 1.14%/0.57%, respectively. In a further experiment, the eighth convolutional layer of VGG16 was deployed on a Zynq7035; the compressed network used 43% fewer DSP resources, shortened inference time by 51.1%, and reduced power consumption by 46.7%.
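
To make the shift-and-add idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the quantization grid, so this example assumes each 4-bit weight is either zero (pruned) or a signed power of two with an exponent between -6 and 0. The names `quantize_pow2` and `shift_dot`, and the rounding in the log domain, are hypothetical choices made for illustration. Under that assumption, multiplying an integer activation by a weight reduces to a left shift, and pruned weights cost nothing.

```python
import math

BITS = 4                             # assumed: 1 sign bit + 3 bits of exponent code
MIN_EXP = -(2 ** (BITS - 1) - 2)     # -6: smallest exponent kept (one code means "zero")
MAX_EXP = 0                          # assumed: weights are normalized so |w| <= 1


def quantize_pow2(w):
    """Map a float weight to (sign, exp) so that w is approximately sign * 2**exp.

    Weights too small to represent are pruned and returned as (0, 0).
    """
    if w == 0:
        return 0, 0
    exp = round(math.log2(abs(w)))   # nearest power of two in the log domain
    if exp < MIN_EXP:
        return 0, 0                  # prune: magnitude below the smallest level
    exp = min(exp, MAX_EXP)          # clip large magnitudes to the top level
    return (1 if w > 0 else -1), exp


def shift_dot(xs, qweights):
    """Dot product of integer activations with power-of-two weights.

    Uses only shifts, additions, and subtractions. The result is scaled by
    2**(-MIN_EXP); divide by that factor to recover the real-valued sum.
    """
    acc = 0
    for x, (sign, exp) in zip(xs, qweights):
        if sign == 0:
            continue                     # pruned weight: skip entirely
        term = x << (exp - MIN_EXP)      # x * 2**(exp - MIN_EXP) as a left shift
        acc += term if sign > 0 else -term
    return acc


weights = [0.5, -0.25, 0.001, 1.0]
activations = [3, 8, 100, -2]
qw = [quantize_pow2(w) for w in weights]
scaled = shift_dot(activations, qw)
print(qw)                        # [(1, -1), (-1, -2), (0, 0), (1, 0)]
print(scaled / 2 ** (-MIN_EXP))  # -2.5
```

In hardware, the same structure would replace each multiplier with a barrel shifter and an adder, which is one plausible reading of why the DSP usage reported in the abstract drops.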

Keywords: deep neural networks; hardware; pruning; quantization; FPGA

Wang Peng (王鹏), Zhang Jiacheng (张嘉诚), Fan Yuyang (范毓洋)

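Key Laboratory of Civil Aircraft Airworthiness Certification Technology, Civil Aviation University of China, Tianjin 300399

College of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300399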

2024

Computer Engineering & Science
College of Computer Science, National University of Defense Technology

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Year, volume (issue): 2024, 46(9)