Design and implementation of dual-mode configurable memory architecture for CNN accelerator

SHAN Rui¹, LI Xiaoshuo¹, GAO Xu¹, HUO Ziqing¹

Author information

1. School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, P.R. China

Abstract

With the rapid development of deep learning algorithms, their computational complexity and functional diversity are increasing rapidly. However, the gap between high computational density and insufficient memory bandwidth under the traditional von Neumann architecture is widening. An analysis of the algorithmic characteristics of convolutional neural networks (CNNs) shows that the access characteristics of convolution (CONV) and fully connected (FC) operations are very different. Based on this observation, a dual-mode reconfigurable distributed memory architecture for a CNN accelerator is designed. It can be configured in Bank mode or first-in first-out (FIFO) mode to accommodate the access needs of different operations. In addition, a programmable memory control unit is designed, which uses customized access instructions to control the dual-mode configurable distributed memory architecture effectively and to reduce data access delay. The proposed architecture is verified and tested through parallel implementations of several CNN algorithms. The experimental results show that the peak bandwidth reaches 13.44 GB·s⁻¹ at an operating frequency of 120 MHz, which is 1.40, 1.12, 2.80 and 4.70 times the peak bandwidth of the existing works used for comparison.
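
The two access modes named in the abstract can be illustrated with a small behavioral sketch in Python. This is not the authors' hardware design: the class `DualModeMemory`, the `Mode` enum, the choice to clear contents on reconfiguration, and the helper `bytes_per_cycle` are all hypothetical names and assumptions introduced here for illustration. The sketch only shows the difference between address-indexed (Bank) access and order-preserving (FIFO) access, plus the arithmetic linking the reported peak bandwidth to the clock frequency, taking 1 GB as 10⁹ bytes.

```python
from collections import deque
from enum import Enum


class Mode(Enum):
    BANK = "bank"  # address-indexed random access
    FIFO = "fifo"  # first-in first-out streaming access


class DualModeMemory:
    """Toy model of one memory block configurable as a Bank or a FIFO.

    Hypothetical illustration only; it does not describe the paper's RTL.
    """

    def __init__(self, depth, mode=Mode.BANK):
        self.depth = depth
        self.configure(mode)

    def configure(self, mode):
        # Assumption: switching modes clears the stored contents.
        self.mode = mode
        self.cells = [0] * self.depth
        self.queue = deque()

    def _check_addr(self, addr):
        if addr is None or not 0 <= addr < self.depth:
            raise IndexError("Bank mode needs an address in range")

    def write(self, data, addr=None):
        if self.mode is Mode.BANK:
            self._check_addr(addr)
            self.cells[addr] = data
        else:
            if len(self.queue) >= self.depth:
                raise OverflowError("FIFO is full")
            self.queue.append(data)

    def read(self, addr=None):
        if self.mode is Mode.BANK:
            self._check_addr(addr)
            return self.cells[addr]
        if not self.queue:
            raise LookupError("FIFO is empty")
        return self.queue.popleft()


def bytes_per_cycle(bandwidth_bytes_per_s, freq_hz):
    """Average bytes transferred per clock cycle at a given bandwidth."""
    return bandwidth_bytes_per_s / freq_hz


if __name__ == "__main__":
    mem = DualModeMemory(depth=4, mode=Mode.BANK)
    mem.write(7, addr=2)
    print(mem.read(addr=2))

    mem.configure(Mode.FIFO)
    mem.write(1)
    mem.write(2)
    print(mem.read(), mem.read())

    # 13.44 GB/s at 120 MHz, with 1 GB taken as 10**9 bytes
    print(bytes_per_cycle(13_440_000_000, 120_000_000))
```

Under the stated unit assumption, 13.44 GB·s⁻¹ at 120 MHz corresponds to 112 bytes per clock cycle in aggregate. How that figure is divided among individual memory blocks is not stated in the abstract.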

Key words

distributed memory structure / neural network accelerator / reconfigurable array processor / configurable memory structure

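Funding

National Key Research and Development Program of China (2022ZD0119001)

National Natural Science Foundation of China (61834005)

National Natural Science Foundation of China (61802304)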

Education Department of Shaanxi Province(22JY060)

Shaanxi Provincial Key Research and Development Plan(2024GX-YBXM-100)

Publication year

2024

High Technology Letters (English edition)

Institute of Scientific and Technical Information of China (ISTIC)

Impact factor: 0.058

ISSN: 1006-6748

Number of references: 30
