An Improved Stochastic Gradient Descent Optimization Algorithm with Momentum
Huang Jianyong 1, Zhou Yuejin 1
Abstract
The stochastic gradient descent optimization algorithm with momentum (SGDM) is currently one of the most commonly used optimization algorithms for training convolutional neural networks (CNNs). However, as neural network models become more complex, the time needed to train them with the SGDM algorithm increases. It is therefore necessary to improve the convergence performance of the SGDM algorithm. In this paper, we propose a new algorithm, SGDMNorm, based on the SGDM algorithm. The new algorithm corrects gradients by using the gradient norms of historical iterations, which improves the convergence speed of the SGDM algorithm. We then analyze the algorithm from the perspective of convergence and prove that the SGDMNorm algorithm has a regret bound of O(√T). Finally, numerical simulations and a CIFAR-10 image classification application demonstrate that the SGDMNorm algorithm converges faster than the SGDM algorithm.
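The abstract states only that SGDMNorm corrects the gradient using the norms of historical gradients on top of standard SGDM; the exact correction rule is not given here. The sketch below is a minimal, hypothetical illustration of this idea, assuming the correction takes the form of rescaling a gradient whose norm exceeds the running average of past gradient norms (the function name `sgdm_norm_step` and this specific scaling rule are assumptions, not the paper's method):

```python
import numpy as np

def sgdm_norm_step(w, grad, velocity, norm_history, lr=0.1, beta=0.9, eps=1e-8):
    """One update of a hypothetical SGDM variant that rescales the gradient
    using the running average of historical gradient norms.

    NOTE: the exact correction in the paper is not specified in the abstract;
    capping the gradient norm at its historical average is an illustrative
    assumption, not the published SGDMNorm rule.
    """
    g_norm = np.linalg.norm(grad)
    norm_history.append(g_norm)
    avg_norm = np.mean(norm_history)
    # Scale down gradients whose norm spikes above the historical average;
    # leave smaller gradients unchanged (factor capped at 1).
    corrected = grad * min(1.0, avg_norm / (g_norm + eps))
    # Standard SGDM momentum accumulation and parameter update.
    velocity = beta * velocity + corrected
    w = w - lr * velocity
    return w, velocity

# Toy problem: minimize f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
norms = []
for _ in range(100):
    w, v = sgdm_norm_step(w, w, v, norms)
# The iterate norm shrinks toward 0 as the optimizer converges.
print(np.linalg.norm(w))
```

On this quadratic the correction rarely triggers, so the trajectory behaves like plain SGDM; the rescaling only matters when a noisy minibatch produces a gradient much larger than the historical average, which is where a norm-based correction can stabilize the step size.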
Key words
gradient descent algorithm / neural networks / gradient norm / classification
Year of publication
2024