A class of algorithms for unconstrained nonconvex optimization is considered
where the value of the objective function is never computed. The class
contains a deterministic version of the first-order Adagrad method typically
used for minimization of noisy function, but also allows the use of
second-order information when available.
The rate of convergence of methods in the class is analyzed and is shown to be
identical to that known for first-order optimization methods using both
function and gradients values. The result is essentially sharp and improves on
previously known complexity bounds (in the stochastic context) by Défossez
et al. (2020) and Gratton et al. (2022). A new class of
methods is designed, for which a slightly worse and essentially sharp
complexity result holds.
Limited numerical experiments show that the new methods' performance may be
comparable to that of standard steepest descent, despite using significantly less
information, and that this performance is relatively insensitive to noise.
|Etat de la publication||Publié - 7 mars 2022|