An Improved Stochastic Gradient Descent Optimization Algorithm with Momentum
Huang Jianyong 1, Zhou Yuejin 1
Abstract
The stochastic gradient descent optimization algorithm with momentum (SGDM) is currently one of the most commonly used optimization algorithms for training convolutional neural networks (CNNs). However, as neural network models become more complex, the time needed to train them with the SGDM algorithm increases. It is therefore necessary to improve the convergence performance of the SGDM algorithm. In this paper, we propose a new algorithm, SGDMNorm, based on the SGDM algorithm. The new algorithm corrects gradients by using the gradient norms of historical iterations, which improves the convergence speed of the SGDM algorithm. We then analyze the algorithm from the perspective of convergence and prove that the SGDMNorm algorithm has a regret bound of O(√T). Finally, numerical simulations and a CIFAR-10 image classification application demonstrate that the SGDMNorm algorithm converges faster than the SGDM algorithm.
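The abstract states only that SGDMNorm corrects the gradient using the norms of historical gradients on top of standard SGDM; the exact correction rule is not given here. The sketch below is a minimal, hypothetical illustration of this idea, assuming the correction takes the form of rescaling a gradient whose norm exceeds the running average of past gradient norms (the function name `sgdm_norm_step` and this specific scaling rule are assumptions, not the paper's method):

```python
import numpy as np

def sgdm_norm_step(w, grad, velocity, norm_history, lr=0.1, beta=0.9, eps=1e-8):
    """One update of a hypothetical SGDM variant that rescales the gradient
    using the running average of historical gradient norms.

    NOTE: the exact correction in the paper is not specified in the abstract;
    capping the gradient norm at its historical average is an illustrative
    assumption, not the published SGDMNorm rule.
    """
    g_norm = np.linalg.norm(grad)
    norm_history.append(g_norm)
    avg_norm = np.mean(norm_history)
    # Scale down gradients whose norm spikes above the historical average;
    # leave smaller gradients unchanged (factor capped at 1).
    corrected = grad * min(1.0, avg_norm / (g_norm + eps))
    # Standard SGDM momentum accumulation and parameter update.
    velocity = beta * velocity + corrected
    w = w - lr * velocity
    return w, velocity

# Toy problem: minimize f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
norms = []
for _ in range(100):
    w, v = sgdm_norm_step(w, w, v, norms)
# The iterate norm shrinks toward 0 as the optimizer converges.
print(np.linalg.norm(w))
```

On this quadratic the correction rarely triggers, so the trajectory behaves like plain SGDM; the rescaling only matters when a noisy minibatch produces a gradient much larger than the historical average, which is where a norm-based correction can stabilize the step size.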
Key words
gradient descent algorithm / neural networks / gradient norm / classification
Year of publication
2024