
DMAdam: Dual averaging enhanced adaptive gradient method for deep neural networks

© 2024 Elsevier B.V.

Deep neural networks (DNNs) have achieved remarkable success in a wide range of fields, largely due to their stable and efficient optimizers. We propose a novel optimizer called Dual Momentum Adam (DMAdam), which combines the stability of dual averaging with the efficiency of adaptive gradient techniques. DMAdam adaptively tunes the learning rate and employs dual averaging updates, effectively balancing stability and convergence rate. This strategy enhances the control of DMAdam over gradient updates, resulting in superior performance in a variety of optimization tasks. Theoretically, we investigate the convergence properties of DMAdam for non-convex models and obtain the non-ergodic convergence of its gradient sequence. Numerically, we demonstrate the impressive performance of DMAdam on CIFAR-10 and CIFAR-100 datasets for image classification tasks. Additionally, DMAdam shows robust performance in natural language processing and object detection tasks. The PyTorch code of DMAdam is available at: https://github.com/Wenhan-Jiang/DMAdam.git.
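The abstract does not give the update rule; for the official implementation see the repository linked above. As a rough illustration of the general idea it describes, the following is a minimal sketch of combining a dual-averaging update (a running sum of gradients applied from the initial point) with an Adam-style adaptive second-moment rescaling. All names, the step scaling, and hyperparameters here are assumptions for illustration, not the paper's DMAdam algorithm:

```python
import numpy as np

def dual_avg_adaptive_sketch(grad_fn, x0, steps=100, lr=0.1,
                             beta2=0.999, eps=1e-8):
    """Hypothetical sketch: dual averaging + adaptive gradient rescaling.

    Not the paper's DMAdam. It accumulates the sum of past gradients
    (the dual-averaging part) and rescales each coordinate with a
    bias-corrected, Adam-style second-moment estimate.
    """
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    g_sum = np.zeros_like(x)   # accumulated gradients (dual averaging)
    v = np.zeros_like(x)       # exponential second-moment estimate
    for t in range(1, steps + 1):
        g = grad_fn(x)
        g_sum += g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = v / (1 - beta2 ** t)   # bias correction, as in Adam
        # Dual-averaging step: restart from x0 each iteration and move
        # against the averaged gradient, rescaled per coordinate.
        x = x0 - lr * g_sum / (t * (np.sqrt(v_hat) + eps))
    return x

# Toy usage on a quadratic f(x) = ||x||^2, whose gradient is 2x.
x_out = dual_avg_adaptive_sketch(lambda x: 2 * x,
                                 np.array([3.0, -2.0]), steps=500)
```

The dual-averaging form keeps every iterate anchored at `x0` and driven by the full gradient history, which is the stability property the abstract attributes to this family of methods.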

Keywords: Adaptive gradient method; Convergence; Deep neural networks; DMAdam; Dual-averaging method

Jiang W., Xu D., Liu J., Zhang N.


Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University

Department of Mathematics, Changchun Normal University

Mathematics and Information Science, Wenzhou University

2025

Knowledge-Based Systems

SCI
ISSN:0950-7051
Year, Volume (Issue): 2025, 309 (Jan. 30)