Design of a convolutional neural network acceleration system based on a heterogeneous platform
Deploying convolutional neural networks (CNNs) on embedded devices with limited computing and storage resources poses challenges such as slow execution speed, low computational efficiency, and high power consumption. This paper proposes a novel CNN acceleration architecture based on a heterogeneous platform, and designs and implements a lightweight CNN acceleration system based on MobileNet. Firstly, to reduce hardware resource consumption and data transmission costs, a design method combining dynamic fixed-point quantization and batch normalization fusion is employed to optimize the network model and reduce the hardware design complexity of the acceleration system. Secondly, through convolutional block partitioning, parallel convolutional computation, and data flow optimization, the efficiency of convolutional operations and the system throughput are effectively improved. Experimental results on the PYNQ-Z2 platform demonstrate that the MobileNet inference acceleration scheme implemented by this system achieves a recognition time of 0.18 seconds per image with a system power consumption of 2.62 watts, a 128-fold improvement in acceleration performance over an ARM single-core processor.
field programmable gate array (FPGA); Vivado high-level synthesis; convolutional neural network; heterogeneous platform; hardware acceleration
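Purely as an illustration of the model-optimization step described in the abstract, the following Python sketch folds batch-normalization parameters into the preceding convolution's weights and then quantizes the fused weights to a dynamic fixed-point format, i.e. a single fractional length chosen per tensor from its dynamic range. The function names, tensor shapes, and the 8-bit word length are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def fuse_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-normalization parameters into the preceding convolution.

    w: conv weights shaped (out_channels, in_channels, kh, kw)
    b: conv bias shaped (out_channels,)
    gamma, beta, mean, var: per-channel BN parameters, shaped (out_channels,)
    """
    scale = gamma / np.sqrt(var + eps)            # per-output-channel scale
    w_fused = w * scale[:, None, None, None]      # scale every filter
    b_fused = (b - mean) * scale + beta           # absorb the BN shift into the bias
    return w_fused, b_fused

def to_dynamic_fixed_point(x, bitwidth=8):
    """Quantize a tensor to dynamic fixed point: one fractional length per tensor."""
    max_abs = np.max(np.abs(x))
    # integer bits (including sign) needed to cover the largest magnitude
    int_bits = int(np.ceil(np.log2(max_abs))) + 1 if max_abs > 0 else 1
    frac_bits = bitwidth - int_bits               # remaining bits hold the fraction
    q_min, q_max = -(2 ** (bitwidth - 1)), 2 ** (bitwidth - 1) - 1
    q = np.clip(np.round(x * 2.0 ** frac_bits), q_min, q_max).astype(np.int32)
    return q, frac_bits                           # the accelerator also needs frac_bits

# Example: fuse BN into a 3x3 convolution layer, then quantize the fused weights to 8 bits.
rng = np.random.default_rng(0)
w = rng.standard_normal((32, 16, 3, 3)).astype(np.float32)
b = np.zeros(32, dtype=np.float32)
gamma, beta = np.ones(32), np.zeros(32)
mean, var = rng.standard_normal(32), np.abs(rng.standard_normal(32)) + 0.1
w_fused, b_fused = fuse_bn_into_conv(w, b, gamma, beta, mean, var)
w_q, fl = to_dynamic_fixed_point(w_fused, bitwidth=8)
```

Fusing BN offline removes the per-channel scale and shift from the hardware datapath, and the per-tensor fractional length lets the accelerator work with narrow integer arithmetic while the host keeps track of the scaling, which is the motivation the abstract gives for combining the two techniques.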