
Model Pruning Algorithm Based on Sparse Optimization and Nesterov Momentum Strategy

With the rapid development of deep learning, the parameter counts and computational complexity of models have grown explosively, which makes deployment on mobile terminals challenging; model pruning has therefore become key to putting deep learning models into practice. Existing regularization-based pruning methods usually combine L2 regularization with a magnitude-based importance criterion. This approach is empirical, lacks theoretical grounding, and makes accuracy difficult to guarantee. Inspired by the Proximal gradient method for solving sparse optimization problems, this paper proposes Prox-NAG, an optimization method that directly produces sparse solutions on deep neural networks, and designs a matching iterative pruning algorithm. The method is based on L1 regularization and uses Nesterov momentum to solve the optimization problem. It removes the dependence of earlier regularization-based pruning methods on L2 regularization and magnitude-based criteria, and is a natural extension of sparse optimization from traditional machine learning to deep learning. Pruning experiments on ResNet-series models on the CIFAR10 dataset show that the Prox-NAG pruning algorithm improves on the performance of the original pruning algorithms.
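The abstract names the ingredients (L1 regularization, a Proximal gradient step, Nesterov momentum) but does not give the update rule, so the following is only a generic sketch of how those pieces usually fit together, not the paper's Prox-NAG algorithm. The function names, the toy quadratic loss, and the values of `lam`, `lr`, and `momentum` are all assumptions chosen for illustration. The point it shows is that soft-thresholding drives small weights to exactly zero, so no separate magnitude cutoff is needed to decide what to prune.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def prox_nesterov_sketch(grad, w0, lam, lr=0.5, momentum=0.9, steps=200):
    """Approximately minimise f(w) + lam * ||w||_1 (hypothetical helper).

    grad(w) must return the gradient of the smooth part f as a list.
    """
    w = list(w0)
    w_prev = list(w0)
    for _ in range(steps):
        # Nesterov look-ahead: extrapolate along the previous displacement.
        v = [wi + momentum * (wi - pi) for wi, pi in zip(w, w_prev)]
        g = grad(v)
        # Gradient step on the smooth part, then soft-thresholding for L1.
        w_next = [soft_threshold(vi - lr * gi, lr * lam) for vi, gi in zip(v, g)]
        w_prev, w = w, w_next
    return w


# Toy smooth loss f(w) = 0.5 * ||w - target||^2 (made-up data, not from the paper).
target = [3.0, -2.0, 0.05, -0.02, 0.0]


def quad_grad(w):
    return [wi - ti for wi, ti in zip(w, target)]


weights = prox_nesterov_sketch(quad_grad, [0.0] * len(target), lam=0.1)
# Weights that are exactly zero are the ones a pruning pass could remove.
pruned_mask = [wi == 0.0 for wi in weights]
```

In this toy problem the three small targets end up at exactly zero while the two large ones are only shrunk by `lam`. On a real network the same step would be applied to weight tensors inside the training loop, which is outside the scope of this sketch.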

sparse optimization; pruning algorithm; Proximal gradient method; Nesterov accelerated gradient (NAG)

Zhou Qiang, Chen Jun, Bao Lei, Tao Qing


Department of Information Engineering, Army Academy of Artillery and Air Defense, Hefei 230031


National Natural Science Foundation of China (62076252)

2024

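Journal of Data Acquisition and Processing
Chinese Institute of Electronics; Signal Processing Society of the China Instrument and Control Society; Weak Signal Detection Society of the China Instrument and Control Society and the Chinese Physical Society; Nanjing University of Aeronautics and Astronautics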


CSTPCD; Peking University Core Journal
Impact factor: 0.679
ISSN: 1004-9037
Year, volume (issue): 2024, 39(3)