DBiT: A High-Precision Binarized ViT FPGA Accelerator
Vision Transformer (ViT) has shown great promise in image processing. However, its large parameter count and high computational complexity cause inference delays, making deployment on edge devices challenging. To overcome these issues, various model compression techniques, such as quantization and distillation, have been developed. Previous studies have explored quantization and binarization of ViT, but their effectiveness in minimizing accuracy loss has been limited, and they have focused primarily on software solutions. Hardware acceleration remains underexplored, yet it is essential for boosting the inference speed of binarized networks. This paper proposes a hardware acceleration scheme for binarized ViT based on a distribution matching layer. Our approach starts with an experimental and theoretical analysis of the data distribution in binarized ViT, leading to the introduction of a distribution matching layer after binarization. We also design a compatible model storage scheme and a hardware acceleration algorithm to improve the efficiency of weight matrix storage and computation. Additionally, optimizing the large matrix multiplications within the self-attention layer significantly improves overall model speed. Experimental results show that our method improves accuracy by 10% over traditional binarized ViT approaches with learning factors, reducing the accuracy gap between binarized and full-precision models to 4%. Furthermore, our approach achieves inference speeds approximately 45 times faster than traditional models.
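The abstract does not spell out the accelerator's compute kernels, but the speedups it reports for binarized ViT rest on the standard XNOR-popcount formulation of binary matrix multiplication, in which ±1 dot products reduce to bitwise operations over packed words. Below is a minimal NumPy sketch of that formulation together with the bit-packed weight storage it implies; the function names, shapes, and packing layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def binarize(x):
    # Sign binarization: map real values to {-1, +1}.
    return np.where(x >= 0, 1, -1).astype(np.int8)

def pack_bits(b):
    # Store {-1, +1} values as {1, 0} bits so 8 weights share one byte,
    # the kind of dense packing a binarized accelerator's weight memory uses.
    return np.packbits((b > 0).astype(np.uint8), axis=-1)

def xnor_popcount_matmul(a_bits, w_bits, n):
    # Binary dot product: popcount(XNOR(a, w)) counts agreeing bit positions,
    # and the {-1, +1} dot product over n bits is then 2 * matches - n.
    xnor = ~(a_bits[:, None, :] ^ w_bits[None, :, :])
    matches = np.unpackbits(xnor, axis=-1, count=n).sum(axis=-1)
    return 2 * matches.astype(np.int32) - n

# Self-check against a plain integer matmul on the same binarized tensors.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64)).astype(np.float32)   # activations
w = rng.standard_normal((8, 64)).astype(np.float32)   # weight rows
xb, wb = binarize(x), binarize(w)
out = xnor_popcount_matmul(pack_bits(xb), pack_bits(wb), n=64)
assert np.array_equal(out, xb.astype(np.int32) @ wb.astype(np.int32).T)
```

On an FPGA, the XNOR and popcount stages map naturally onto LUTs and adder trees, which is what makes this formulation attractive for the kind of weight storage and large self-attention matrix multiplications the abstract describes.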
Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, PR China
University of Chinese Academy of Sciences, Beijing 100049, PR China