An overview of the joint optimization method for neural network compression
With the increasing demand for real-time performance, privacy, and security in AI applications, deploying high-performance neural networks on edge computing platforms has become a research hotspot. Since common edge computing platforms are limited in storage, computing power, and power consumption, edge deployment of deep neural networks remains a major challenge. One way to overcome this challenge is to compress an existing neural network so that it fits the deployment constraints of the device. Commonly used model compression algorithms include pruning, quantization, and knowledge distillation. These methods have complementary strengths, so combining them can achieve better compression and acceleration, and joint compression is becoming a research hotspot. This paper first gives a brief overview of commonly used model compression algorithms, then summarizes three common joint compression schemes: "knowledge distillation + pruning", "knowledge distillation + quantization", and "pruning + quantization", focusing on the basic ideas and methods of joint compression. Finally, key future directions for the joint optimization of neural network compression are proposed.
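
The "pruning + quantization" scheme mentioned above can be illustrated with a minimal PyTorch sketch that first applies magnitude pruning and then post-training dynamic quantization. This is only an illustrative example under assumed settings (a toy fully connected model, 50% sparsity, int8 weights), not a method taken from the surveyed papers; in practice the pruned network would normally be fine-tuned before quantization.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy fully connected network standing in for an already trained model
# (hypothetical sizes chosen only for demonstration).
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Step 1: magnitude (L1) pruning -- zero out the 50% of weights with the
# smallest absolute values in every Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the zeroed weights in permanently

# Step 2: post-training dynamic quantization -- store Linear weights in int8
# and quantize activations on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The compressed model is used exactly like the original one.
with torch.no_grad():
    output = quantized_model(torch.randn(1, 128))
print(output.shape)  # torch.Size([1, 10])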

Keywords: neural network; compression; pruning; quantization; knowledge distillation; model compression; deep learning

Authors: NING Xin, ZHAO Wenyao, ZONG Yixin, ZHANG Yugui, CHEN Hao, ZHOU Qi, MA Junxiao


Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China

School of Microelectronics, Hefei University of Technology, Hefei 230009, China

Bureau of Frontier Sciences and Education, Chinese Academy of Sciences, Beijing 100864, China

College of Artificial Intelligence, Nankai University, Tianjin 300071, China



Funding: National Natural Science Foundation of China (62373343); Beijing Natural Science Foundation (L233036)

Year: 2024

Journal: CAAI Transactions on Intelligent Systems (智能系统学报)
Sponsored by: Chinese Association for Artificial Intelligence; Harbin Engineering University

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.672
ISSN: 1673-4785
Year, volume (issue): 2024, 19(1)