Design of Convolutional Neural Network Accelerator for Microcontroller
To address the difficulty embedded microcontrollers have in meeting the performance demands of real-time image recognition, a convolutional neural network accelerator suitable for microcontrollers is proposed. In the convolutional layer, the accelerator uses a non-blocking, row-parallel multiply-add unit structure, which gives higher hardware utilization. To sustain the throughput required by row-parallel data, a dedicated convolution SRAM memory is designed. The accelerator integrates the pooling and activation units into the data path, effectively reducing the time overhead caused by repeated data access. FPGA prototype verification shows that the accelerator can reach 92.2 GOPS at 100 MHz. The accelerator is synthesized in a TSMC 130 nm process: its dynamic power consumption is 33 mW, its area is 90 764.2 μm², and its energy efficiency is 2 793 GOPS/W, about one hundred times that of an FPGA accelerator. The accelerator is low in both power and cost, which favors the wide application of embedded systems in machine vision tasks such as object detection and face recognition.
convolutional neural network; parallel computing; pipeline; hardware accelerator; application specific integrated circuit
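
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the energy efficiency is simply the reported throughput divided by the reported dynamic power, and that the 92.2 GOPS throughput measured on the FPGA prototype at 100 MHz also applies to the synthesized design; neither assumption is stated explicitly in the abstract. The function names are illustrative only. Under these assumptions the ratio comes out at about 2 793.9 GOPS/W, consistent with the quoted 2 793 GOPS/W, and the throughput corresponds to 922 operations per clock cycle.

```python
# Figures quoted in the abstract.
THROUGHPUT_GOPS = 92.2    # reported throughput at 100 MHz
CLOCK_MHZ = 100.0         # clock frequency of the reported measurement
DYNAMIC_POWER_W = 0.033   # 33 mW dynamic power after synthesis


def energy_efficiency(gops: float, watts: float) -> float:
    """Energy efficiency in GOPS/W (assumed: throughput / dynamic power)."""
    return gops / watts


def ops_per_cycle(gops: float, mhz: float) -> float:
    """Average number of operations completed per clock cycle."""
    return (gops * 1e9) / (mhz * 1e6)


efficiency = energy_efficiency(THROUGHPUT_GOPS, DYNAMIC_POWER_W)
per_cycle = ops_per_cycle(THROUGHPUT_GOPS, CLOCK_MHZ)

print(f"energy efficiency: {efficiency:.1f} GOPS/W")
print(f"operations per cycle: {per_cycle:.0f}")
```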