
An Optimization Method for Configurable 2-D Convolvers Based on Triangular Number Decomposition

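Multi-size 2-D convolution plays an important role in computer vision tasks such as detection and classification through feature extraction. However, an efficient design method for configurable 2-D convolvers is currently lacking, which limits the deployment and application of Convolutional Neural Network (CNN) models at the edge. Based on multiplication management and the triangular number decomposition of odd square numbers, this paper proposes a high-performance, highly adaptable 2-D convolver with a configurable kernel size. The proposed 2-D convolver contains a certain number of Processing Elements (PEs) and corresponding control units; the former perform the computation and the latter manage how multiplication operations are combined, and together they realize convolutions of different sizes. Specifically, a list of odd numbers is first determined according to the application scenario, giving the kernel sizes supported by the 2-D convolver, and triangular number decomposition is used to obtain the corresponding list of triangular numbers. Next, the total number of PEs is determined from the triangular number list and the computational requirements. Finally, the PE interconnection is determined by composing larger sizes from smaller ones, which completes the circuit design. The configurable 2-D convolver is designed in the Verilog Hardware Description Language (HDL) and simulated and analyzed with Vivado 2022.2 on the XCZU7EG board. Experimental results show that, compared with similar methods, the proposed configurable 2-D convolver significantly improves multiplication resource utilization, from 20%~50% to 89%, and achieves a throughput of 1 500 MB/s with 514 logic units, demonstrating broad applicability.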
A Reconfigurable 2-D Convolver Based on Triangular Numbers Decomposition
Two-Dimensional (2-D) convolution with different kernel sizes improves overall performance in computer vision tasks. Currently, there is no efficient design method for reconfigurable 2-D convolvers, which limits the deployment of Convolutional Neural Network (CNN) models at the edge. In this paper, a new approach based on multiplication management and triangular numbers decomposition is proposed. The proposed 2-D convolver includes a certain number of Processing Elements (PEs) and corresponding control units, where the former are responsible for computing tasks and the latter manage the combination of multiplication operations to achieve different convolution sizes. Specifically, an odd number list is first determined based on the application scenario; it represents the supported sizes of the 2-D convolutional kernel. The corresponding triangular number list is obtained using the triangular numbers decomposition method. Then, the total number of PEs is determined based on the triangular number list and computational requirements. Finally, the corresponding control units and the interconnection of PEs are determined by the addition combinations of triangular numbers. The proposed reconfigurable 2-D convolver is designed in Verilog Hardware Description Language (HDL) and implemented with Vivado 2022.2 software on the XCZU7EG board. Compared with similar methods, the proposed 2-D convolver significantly improves the utilization of multiplication resources, raising it from 20%~50% to 89%, and achieves a throughput of 1 500 MB/s with 514 logic units, thereby demonstrating its wide applicability.
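The step from the odd number list to the triangular number list can be illustrated with a well-known identity: every odd square satisfies (2n+1)^2 = 8·T_n + 1, where T_n = n(n+1)/2 is the n-th triangular number. The sketch below is only an illustration under the assumption that the decomposition takes this form; the abstract does not spell out the exact formula, and the function names and the example kernel sizes `[3, 5, 7]` are hypothetical.

```python
def triangular(n: int) -> int:
    """Return the n-th triangular number T_n = n(n+1)/2."""
    return n * (n + 1) // 2


def decompose_odd_square(k: int) -> int:
    """For an odd kernel size k = 2n+1, return T_n with k*k == 8*T_n + 1.

    Assumption: this identity is what "triangular number decomposition"
    refers to; the abstract itself does not state the formula.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("kernel size must be a positive odd integer")
    n = (k - 1) // 2
    t = triangular(n)
    # Sanity check of the identity for this k.
    assert k * k == 8 * t + 1
    return t


# Hypothetical list of supported kernel sizes (the "odd number list").
kernel_sizes = [3, 5, 7]
triangular_list = [decompose_odd_square(k) for k in kernel_sizes]

for k, t in zip(kernel_sizes, triangular_list):
    # A k x k kernel needs k*k multiplications per output pixel.
    print(f"{k}x{k}: {k * k} multiplications = 8 * {t} + 1")
```

For these sizes the triangular list is `[1, 3, 6]`. Since T_n = T_(n-1) + n, larger entries can be written as sums of smaller ones, which is consistent with the abstract's description of building larger sizes from smaller ones; how the paper maps these sums onto PEs is not stated here.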

2-D convolver; Reconfigurable architecture; Multiplication management; Triangular numbers decomposition

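Huang Jiye (黄继业), Xiao Qiang (肖强), Tian Dahai (田大海), Gao Mingyu (高明裕), Wang Junfan (王俊帆), Dong Zhekang (董哲康), Huang Xiwei (黄汐威)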


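School of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018

Zhejiang Provincial Key Laboratory of Equipment Electronics, Hangzhou 310018

College of Electrical Engineering, Zhejiang University, Hangzhou 310027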

2-D convolver; configurable architecture; multiplication management; triangular number decomposition

National Key Research and Development Program of China

2022YFD2000100

2024

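Journal of Electronics & Information Technology
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China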

CSTPCD; Peking University Core Journal
Impact factor: 1.302
ISSN: 1009-5896
Year, volume (issue): 2024, 46(7)