Research on a Depthwise Separable Convolution Accelerator Based on FPGA
A low-power depthwise separable convolution accelerator kernel based on FPGA is designed. Exploiting the computation shared by pointwise (PW) and depthwise (DW) convolution, a fixed multiplier array implements both convolution structures by changing only the feature and weight input data streams, which maximizes DSP utilization. To address possible sign-bit overflow in 8-bit asymmetric quantization, the double-multiplier structure is repackaged using a sign-bit processing method. A 7-stage intra-layer pipeline guarantees the parallelism of data processing in every cycle. The accelerator is deployed on a Zynq UltraScale+ series FPGA. Experimental results show that the proposed structure improves network inference speed while reducing dependence on on-chip resources and overall power consumption. The average throughput of the original MobileNetV2 on the proposed FPGA accelerator reaches 130.6 GOPS at an overall power consumption of only 4.1 W, which meets the requirements of real-time edge computing. Compared with other hardware platforms, the energy efficiency ratio is significantly improved; compared with accelerators of the same type on FPGA, the design has advantages in performance density (GOPS/LUT), power efficiency (GOPS/W), and DSP efficiency (GOPS/DSP).
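To illustrate how a single fixed multiplier array can serve both convolution types, the following is a minimal software sketch, not the accelerator's actual hardware description. All names (`mac_array`, `depthwise`, `pointwise`, `centered_range`) are hypothetical, and the model assumes unsigned 8-bit asymmetric quantization, which the abstract does not specify. Each lane of the array computes one dot product; DW and PW convolution differ only in which features and weights are streamed into the lanes. The last helper shows why subtracting the zero point from an unsigned 8-bit value can require a ninth (sign) bit.

```python
from typing import List, Tuple


def mac_array(features: List[List[int]], weights: List[List[int]]) -> List[int]:
    """Model of a fixed multiplier array: lane i computes dot(features[i], weights[i])."""
    assert len(features) == len(weights), "one feature stream per weight stream"
    return [sum(f * w for f, w in zip(f_row, w_row))
            for f_row, w_row in zip(features, weights)]


def depthwise(windows: List[List[int]], dw_kernels: List[List[int]]) -> List[int]:
    """DW convolution at one output position.

    windows[c] is the flattened KxK patch of channel c and dw_kernels[c] is
    that channel's flattened KxK kernel, so each lane handles one channel.
    """
    return mac_array(windows, dw_kernels)


def pointwise(pixel: List[int], pw_kernels: List[List[int]]) -> List[int]:
    """PW (1x1) convolution at one output position.

    The same input-channel vector is broadcast to every lane, and lane o
    receives the weights of output channel o.
    """
    return mac_array([pixel] * len(pw_kernels), pw_kernels)


def centered_range(zero_point: int) -> Tuple[int, int]:
    """Range of (q - zero_point) for an unsigned 8-bit q in [0, 255] (assumed format)."""
    return (0 - zero_point, 255 - zero_point)


# Two channels with 2x2 patches, followed by a 1x1 convolution to three channels.
dw_out = depthwise([[1, 2, 3, 4], [5, 6, 7, 8]], [[1, 0, 0, 1], [1, 1, 1, 1]])
pw_out = pointwise(dw_out, [[1, 1], [2, -1], [0, 3]])
print(dw_out)  # [5, 26]
print(pw_out)  # [31, -16, 78]

# With zero_point = 0 the centered value reaches 255, outside the int8 range
# [-128, 127]; across all zero points the range is [-255, 255], i.e. 9 signed bits.
print(centered_range(0))    # (0, 255)
print(centered_range(128))  # (-128, 127)
```

In this model the multiplier array itself never changes between the two modes; only the data arrangement fed into `mac_array` does, which is the property the abstract relies on to keep DSP utilization high.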