Low-power Face Detection Acceleration System Based on the Zynq Platform

Abstract: Convolutional neural network platforms based on CPUs and GPUs are bulky and consume a lot of power. To address these issues, we designed and implemented a convolutional neural network face detection acceleration system based on the Zynq platform. The system adopts the YOLOv3-Tiny algorithm and is trained on the WIDER FACE dataset. To improve network efficiency, a layer-fusion technique reduces the network depth and speeds up detection, and an 8-bit integer quantization strategy lowers memory access and resource consumption. Exploiting the parallel computing capability of the field-programmable gate array (FPGA) side of the Zynq XC7Z035 chip, we designed a reusable multichannel convolution computation module that allows the digital signal processing (DSP) resources to be reused. Experimental results show that the acceleration system achieves a real-time inference speed of 9.5 FPS, which is 7.9 times the detection speed of an Intel i7-8700 CPU, while consuming only 2.65 W, meeting the low-power requirement.
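The abstract names layer fusion and 8-bit integer quantization without giving details, so the following is only a minimal sketch of what such steps commonly look like, not the authors' implementation. It assumes that layer fusion means folding a batch-normalization layer into the preceding convolution, and that quantization is symmetric with one scale per tensor; both are assumptions, as are all function names. A convolution at one spatial position is treated as a dot product between a flattened kernel and a flattened input patch.

```python
import math


def fold_batchnorm(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into conv weights and biases.

    weights[c] is the flattened kernel of output channel c (hypothetical layout).
    """
    fused_w, fused_b = [], []
    for w, b, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)            # per-channel BN scale
        fused_w.append([scale * x for x in w])    # scale every kernel tap
        fused_b.append(scale * (b - m) + bt)      # absorb mean and shift
    return fused_w, fused_b


def conv_at_point(kernel, bias, patch):
    """Convolution output at one position: dot product plus bias."""
    return sum(k * p for k, p in zip(kernel, patch)) + bias


def batchnorm(y, g, bt, m, v, eps=1e-5):
    """Inference-time batch normalization of a single value."""
    return g * (y - m) / math.sqrt(v + eps) + bt


def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate real values."""
    return [x * scale for x in q]


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    w = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
    b = [0.1, -0.2]
    fw, fb = fold_batchnorm(w, b, [1.2, 0.8], [0.05, -0.1], [0.3, -0.4], [0.9, 1.6])
    q, s = quantize_int8([x for row in fw for x in row])
    print("fused bias:", fb)
    print("int8 weights:", q, "scale:", s)
```

After folding, the batch-normalization layer no longer has to be evaluated at inference time, which is one way a fused network ends up with fewer layers. Storing weights as 8-bit integers instead of 32-bit floats cuts the bytes moved per weight by a factor of four.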

Keywords:

convolutional neural network; layer fusion; quantization; multichannel convolution; FPGA

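Authors:

Zhao Min (赵民), Xu Sheng (徐胜), Han Luyu (韩路宇), Lin Zhixian (林志贤)

Affiliations:

College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116

Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou 350116

School of Advanced Manufacturing, Fuzhou University, Quanzhou, Fujian 362200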

Funding:

National Key Research and Development Program of China; Natural Science Foundation of Fujian Province

Project numbers:

2021YFB3600603; 2020J01468

Publication year:

2024

DOI:

10.16818/j.issn1001-5868.2023122501

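Semiconductor Optoelectronics (半导体光电)

The 44th Research Institute of China Electronics Technology Group Corporation

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.362

ISSN: 1001-5868

Year, volume (issue): 2024, 45(3)

Number of references: 1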