Neural network quantization method based on rounding error

郭秋丹¹, 濮约刚¹, 张启军¹, 丁传红¹, 吴栋¹

Author information

1. Institute 706, Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China

Abstract

Deep neural networks incur high computational costs. Reducing the power consumption and latency of neural network inference is key to integrating neural networks into edge devices with stringent power and computational constraints. To address this, an end-to-end post-training quantization method based on rounding error is proposed to mitigate the accuracy degradation that occurs when neural networks are quantized to low bit widths. The method requires only a small batch of unlabeled data for training and performs well across different neural network architectures: with both weights and activations quantized to a bit width of 4, RegNetX-3.2GF loses less than 2% classification accuracy.
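To make the term "rounding error" concrete, the following is a minimal sketch of plain round-to-nearest uniform quantization at a bit width of 4. It is not the method proposed in the paper, whose details the abstract does not give. The function name `quantize_uniform`, the symmetric per-tensor scale, and the example weights are all assumptions chosen for illustration.

```python
# Illustrative sketch only: plain uniform round-to-nearest quantization.
# This is NOT the paper's method; names and values here are hypothetical.

def quantize_uniform(values, num_bits=4):
    """Quantize floats to signed num_bits integer levels and map them back.

    Returns (dequantized_values, rounding_errors, scale).
    """
    qmin = -(2 ** (num_bits - 1))      # -8 for 4 bits
    qmax = 2 ** (num_bits - 1) - 1     # 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    # Symmetric per-tensor scale (an assumed choice); guard against all zeros.
    scale = max_abs / qmax if max_abs > 0 else 1.0

    dequantized, errors = [], []
    for v in values:
        q = round(v / scale)           # nearest integer level (ties go to even)
        q = max(qmin, min(qmax, q))    # clip to the representable range
        v_hat = q * scale              # map back to floating point
        dequantized.append(v_hat)
        errors.append(v_hat - v)       # rounding error introduced for this value
    return dequantized, errors, scale


weights = [0.42, -0.17, 0.03, -0.55, 0.29]   # made-up example weights
w_hat, err, scale = quantize_uniform(weights, num_bits=4)
max_err = max(abs(e) for e in err)
```

With this scale no value is clipped, so each rounding error is at most half a quantization step (`scale / 2`). Errors of this kind accumulate through a network and lead to the accuracy degradation at low bit widths that the abstract refers to.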

Key words

model compression/network distillation/network quantization/target recognition/quantization-aware training/post-training quantization/rounding error

Publication year

2024

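Computer Engineering and Design (计算机工程与设计)

Institute 706, Second Academy of China Aerospace Science and Industry Corporation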

CSTPCD / Peking University Core Journals (北大核心)

Impact factor: 0.617

ISSN: 1000-7024