An Adagrad-inspired class of algorithms for smooth unconstrained optimization
is presented in which the objective function is never evaluated and yet the
gradient norms decrease at least as fast as lO(1/\sqrt{k+1}) while
second-order optimality measures converge to zero at least as fast as
O(1/(k+1)^{1/3}). This latter rate of convergence is shown to be
essentially sharp and is identical to that known for more standard algorithms
(like trust-region or adaptive-regularization methods) using both function and
derivatives' evaluations. A related ``divergent stepsize'' method is also
described, whose essentially sharp rate of convergence is slighly inferior. It
is finally discussed how to obtain weaker second-order optimality guarantees
at a (much) reduced computional cost.
Original language | English |
---|
Publication status | Published - 7 Mar 2022 |
---|