Computer Engineering and Design, 2024, Vol. 45, Issue 1: 71-78. DOI: 10.16208/j.issn1000-7024.2024.01.010

Acceleration of CNN for VSLAM system on FPGA platform

YU Yuan, LI Peijun, WANG Guangqi, ZHANG Debing, ZHANG Chun

Author information

  • 1. School of Integrated Circuits, Tsinghua University, Beijing 100084, China

Abstract

To accelerate the convolutional neural network (CNN) used in a visual simultaneous localization and mapping (VSLAM) system on an FPGA, a low-power, efficient CNN accelerator and its corresponding SoC system were designed based on the SuperPoint model. Loop tiling, data reuse, compute-unit unrolling (parallel computation), and double buffering were adopted to make full use of the accelerator's on-chip resources. To improve burst transfer efficiency, the weight parameters were rearranged in advance. Pack and Unpack modules were proposed, and multi-channel data transmission was designed to increase the transfer bandwidth. The whole SoC system was deployed on the Ultra96-V2 FPGA platform and achieved a throughput of 25.63 GOPS at a power consumption of only about 3 W. Its BRAM efficiency, DSP efficiency, performance density, and power efficiency show clear advantages over previous work.
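The loop tiling named in the abstract can be illustrated with a minimal software sketch. This is not the authors' accelerator design: the function names, the tile sizes `Tm`, `Tc`, `Th`, `Tw`, the data layout, and the stride-1, no-padding convolution are all assumptions made for illustration. The sketch only shows that splitting the loops into tiles, each small enough to fit a hypothetical on-chip buffer, leaves the result unchanged.

```python
def conv2d_direct(ifm, weights):
    """Reference convolution (stride 1, no padding).

    ifm[c][y][x] is the input feature map; weights[m][c][ky][kx] are the kernels.
    """
    C, H, W = len(ifm), len(ifm[0]), len(ifm[0][0])
    M, K = len(weights), len(weights[0][0])
    OH, OW = H - K + 1, W - K + 1
    ofm = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    for m in range(M):
        for y in range(OH):
            for x in range(OW):
                acc = 0
                for c in range(C):
                    for ky in range(K):
                        for kx in range(K):
                            acc += ifm[c][y + ky][x + kx] * weights[m][c][ky][kx]
                ofm[m][y][x] = acc
    return ofm


def conv2d_tiled(ifm, weights, Tm=2, Tc=2, Th=2, Tw=2):
    """Same convolution, computed tile by tile (tile sizes are illustrative)."""
    C, H, W = len(ifm), len(ifm[0]), len(ifm[0][0])
    M, K = len(weights), len(weights[0][0])
    OH, OW = H - K + 1, W - K + 1
    ofm = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    # Outer loops walk over tiles. On hardware, each iteration would correspond
    # to loading one input tile and one weight tile into on-chip buffers.
    for m0 in range(0, M, Tm):
        for y0 in range(0, OH, Th):
            for x0 in range(0, OW, Tw):
                for c0 in range(0, C, Tc):
                    # Inner loops touch only data belonging to the current tile.
                    for m in range(m0, min(m0 + Tm, M)):
                        for y in range(y0, min(y0 + Th, OH)):
                            for x in range(x0, min(x0 + Tw, OW)):
                                # Partial sums accumulate across input-channel tiles.
                                acc = ofm[m][y][x]
                                for c in range(c0, min(c0 + Tc, C)):
                                    for ky in range(K):
                                        for kx in range(K):
                                            acc += (ifm[c][y + ky][x + kx]
                                                    * weights[m][c][ky][kx])
                                ofm[m][y][x] = acc
    return ofm
```

In a hardware design, the inner loops would typically be unrolled into parallel compute units, and the tile loads would overlap with computation through double buffering. Neither is modeled here.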

Key words

visual simultaneous localization and mapping system/image processing/convolution acceleration/data reuse/parallel computation/burst transmission/software and hardware collaboration

Funding

National Natural Science Foundation of China (U20A20220)

Publication year

2024
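Computer Engineering and Design
Institute 706, Second Academy of China Aerospace Science and Industry Corporation

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.617
ISSN: 1000-7024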