Low-power Face Detection Acceleration System Based on the Zynq Platform

To address the large size and high power consumption of CPU- and GPU-based convolutional neural network platforms, we designed and implemented a convolutional neural network face detection acceleration system based on the Zynq platform. The system adopts the YOLOv3-Tiny algorithm and is trained on the WIDER FACE dataset. To improve network efficiency, a layer-fusion technique reduces the network depth and speeds up detection, and an 8-bit integer quantization strategy reduces memory access and resource consumption. Leveraging the parallel computing capability of the field-programmable gate array (FPGA) fabric on the Zynq XC7Z035 chip, we designed a reusable multichannel convolution computation module that reuses the digital signal processing (DSP) resources. Experimental results show that the acceleration system achieves a real-time inference speed of 9.5 FPS, which is 7.9 times the detection speed of an Intel i7-8700 CPU, while consuming only 2.65 W, meeting the low-power requirement.
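The abstract names layer fusion and 8-bit integer quantization but does not give their exact form. The following is a minimal sketch under two assumptions that are not confirmed by the source: that layer fusion means the common folding of a batch-normalization layer into the preceding convolution, and that quantization is symmetric and per-tensor. The function names `fold_batchnorm`, `quantize_int8`, and `dequantize` are hypothetical and chosen only for illustration.

```python
import math

def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into conv weights and bias.

    Assumed fusion scheme (not stated in the abstract):
        BN(w.x + b) = gamma * (w.x + b - mean) / sqrt(var + eps) + beta
    which equals w'.x + b' with
        w' = w * scale,  b' = (b - mean) * scale + beta,
        scale = gamma / sqrt(var + eps).
    `weights` is a list with one flat weight list per output channel.
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - mu) * scale + bt)
    return folded_w, folded_b

def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map quantized integers back to approximate real values."""
    return [x * scale for x in q]
```

After folding, the batch-normalization layer no longer needs to be evaluated at inference time, which is one way a network's effective depth can shrink. Storing the folded weights as 8-bit integers plus a single scale factor cuts weight memory to a quarter of 32-bit floating point, at the cost of a rounding error of at most half a quantization step per weight.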

Keywords: convolutional neural network; layer fusion; quantization; multichannel convolution; FPGA

Zhao Min (赵民), Xu Sheng (徐胜), Han Luyu (韩路宇), Lin Zhixian (林志贤)

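College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116

Fujian Science and Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou 350116

School of Advanced Manufacturing, Fuzhou University, Quanzhou, Fujian 362200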

Funding: National Key Research and Development Program of China (2021YFB3600603); Natural Science Foundation of Fujian Province (2020J01468)

2024

Semiconductor Optoelectronics (半导体光电)
The 44th Research Institute of China Electronics Technology Group Corporation

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.362
ISSN: 1001-5868
Year, volume (issue): 2024, 45(3)