Divergence of the ADAM algorithm with fixed-stepsize: a (very) simple example

Research output: Working paper › Discussion paper
Abstract

A very simple unidimensional function with Lipschitz continuous gradient is constructed such that the ADAM algorithm with constant stepsize, started from the origin, diverges when applied to minimize this function in the absence of noise on the gradient. Divergence occurs irrespective of the choice of the method parameters.
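As a hedged illustration of the setting the abstract describes, the sketch below implements the standard ADAM recursion with a constant stepsize and exact (noise-free) gradients. The objective used here is a placeholder quadratic, not the paper's counterexample, whose precise construction is given in the paper itself; all parameter names and values are illustrative assumptions.

import math

def adam_fixed_stepsize(grad, x0=0.0, alpha=0.1, beta1=0.9, beta2=0.999,
                        eps=1e-8, n_steps=50):
    """Deterministic ADAM with constant stepsize alpha (no gradient noise).

    Standard ADAM recursions with bias correction; the objective enters
    only through its exact gradient `grad`.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, n_steps + 1):
        g = grad(x)                          # exact, noise-free gradient
        m = beta1 * m + (1 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1 - beta1 ** t)         # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)         # bias-corrected second moment
        x -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Illustrative run, started from the origin as in the paper, on the
# stand-in objective f(x) = x**2 (NOT the paper's counterexample):
print(adam_fixed_stepsize(lambda x: 2.0 * x))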
Original language: English
Publisher: arXiv
Number of pages: 3
Volume: 2308.00720
Publication status: Published - Aug 2023