REVIEW OF QUANTIZATION-BASED DEEP NEURAL NETWORK OPTIMIZATION RESEARCH
The escalating parameter counts and computational demands of deep neural networks pose significant challenges for deploying these models on resource-constrained devices. To address this challenge, quantization has emerged as a prominent technique for compressing and accelerating deep neural networks by reducing the bit-width of model parameters and intermediate feature maps. This article presents a comprehensive review of quantization-based optimization for deep neural networks. First, common quantization methods and their research progress are discussed, and their similarities, differences, advantages, and disadvantages are analyzed. Next, different quantization granularities, such as layer-wise quantization, group-wise quantization, and channel-wise quantization, are examined. Finally, the relationship between training and quantization is analyzed, and the achievements and open challenges of current research are discussed, with the aim of laying a theoretical foundation for future studies on deep neural network quantization.
Keywords: deep neural network; model quantization; quantization-aware training; post-training quantization
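As a concrete illustration of the mechanism described above, the following is a minimal NumPy sketch of symmetric uniform quantization that contrasts a per-tensor (layer-wise) scale with per-channel scales. The function names, the toy weight matrix, and the choice of symmetric signed quantization are illustrative assumptions, not taken from the works under review.

import numpy as np

def uniform_quantize(x, num_bits=8, axis=None):
    """Symmetric uniform quantization of a float tensor to signed integers.

    axis=None -> a single scale for the whole tensor (per-tensor / layer-wise)
    axis=0    -> one scale per output channel (channel-wise)
    """
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for int8
    if axis is None:
        max_abs = np.max(np.abs(x))
    else:
        # Reduce over every axis except `axis`; keep dims for broadcasting.
        reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
        max_abs = np.max(np.abs(x), axis=reduce_axes, keepdims=True)
    scale = np.maximum(max_abs / qmax, 1e-12)  # guard against all-zero rows
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to approximate float values."""
    return q.astype(np.float32) * scale

# Toy weight matrix: 4 output channels x 8 inputs (hypothetical example).
w = np.random.randn(4, 8).astype(np.float32)

q_t, s_t = uniform_quantize(w, axis=None)  # layer-wise granularity
q_c, s_c = uniform_quantize(w, axis=0)     # channel-wise granularity

err_t = np.abs(w - dequantize(q_t, s_t)).mean()
err_c = np.abs(w - dequantize(q_c, s_c)).mean()
print(f"per-tensor error: {err_t:.5f}, per-channel error: {err_c:.5f}")

In this sketch, channel-wise scales typically yield a lower reconstruction error than a single per-tensor scale, at the cost of storing one scale per channel; this trade-off between accuracy and overhead is exactly what the granularity comparison in the review concerns.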