TY - UNPB
T1 - Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training
AU - Gratton, Serge
AU - Kopanicakova, Alena
AU - Toint, Philippe
PY - 2023/2/15
Y1 - 2023/2/15
N2 - An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in which the objective function is never evaluated, but only derivatives are used. This algorithm belongs to the class of adaptive regularization methods, for which optimal worst-case complexity results are known for the standard framework where the objective function is evaluated. It is shown in this paper that these excellent complexity bounds are also valid for the new algorithm, despite the fact that significantly less information is used. In particular, it is shown that, if derivatives of degree one to p are used, the algorithm will find an epsilon_1-approximate first-order minimizer in at most O(epsilon_1^{-(p+1)/p}) iterations, and an (epsilon_1, epsilon_2)-approximate second-order minimizer in at most O(max(epsilon_1^{-(p+1)/p}, epsilon_2^{-(p+1)/(p-1)})) iterations. As a special case, the new algorithm using first and second derivatives, when applied to functions with Lipschitz continuous Hessian, will find an iterate x_k that is an epsilon_1-approximate first-order minimizer in at most O(epsilon_1^{-3/2}) iterations.
AB - An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in which the objective function is never evaluated, but only derivatives are used. This algorithm belongs to the class of adaptive regularization methods, for which optimal worst-case complexity results are known for the standard framework where the objective function is evaluated. It is shown in this paper that these excellent complexity bounds are also valid for the new algorithm, despite the fact that significantly less information is used. In particular, it is shown that, if derivatives of degree one to p are used, the algorithm will find an epsilon_1-approximate first-order minimizer in at most O(epsilon_1^{-(p+1)/p}) iterations, and an (epsilon_1, epsilon_2)-approximate second-order minimizer in at most O(max(epsilon_1^{-(p+1)/p}, epsilon_2^{-(p+1)/(p-1)})) iterations. As a special case, the new algorithm using first and second derivatives, when applied to functions with Lipschitz continuous Hessian, will find an iterate x_k that is an epsilon_1-approximate first-order minimizer in at most O(epsilon_1^{-3/2}) iterations.
M3 - Working paper
VL - 2302.07049
BT - Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training
PB - arXiv
ER -