Acceleration and stability of the stochastic proximal point algorithm


Stochastic gradient descent (SGD) has emerged as the de facto method for solving (unconstrained) stochastic optimization problems. However, it suffers from two fundamental limitations: $(i)$ slow convergence due to inexact gradient approximations, and $(ii)$ numerical instability, especially with respect to the step-size. To address the slow convergence, accelerated variants such as stochastic gradient descent with momentum (SGDM) have been studied; however, the interplay of gradient noise and momentum can aggravate the numerical instability. Proximal point methods, on the other hand, have gained much attention owing to their numerical stability, yet their accelerated stochastic variants have received limited attention. To bridge this gap, we propose the stochastic proximal point algorithm with momentum (SPPAM), and study its convergence and stability.
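The abstract does not spell out the SPPAM update rule. As a rough illustration only (not the paper's exact method), one can sketch a stochastic proximal point step combined with a heavy-ball-style momentum extrapolation on a least-squares problem, where the per-sample proximal operator has a closed form. All variable names and parameter values below are hypothetical:

```python
import numpy as np

# Hypothetical sketch: stochastic proximal point iteration with a heavy-ball-style
# momentum (extrapolation) term, on least squares, where the per-sample prox is
# available in closed form. This is an illustrative assumption, not the paper's
# exact SPPAM recursion.
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true  # consistent (noiseless) system, so the minimizer is x_true

def prox_step(z, a, y, eta):
    # For f_i(x) = 0.5 * (a^T x - y)^2, the proximal operator is closed-form:
    # prox_{eta * f_i}(z) = z - eta * (a^T z - y) / (1 + eta * ||a||^2) * a
    return z - eta * (a @ z - y) / (1.0 + eta * (a @ a)) * a

eta, beta = 1.0, 0.3  # hypothetical step-size and momentum values
x_prev = x = np.zeros(d)
for _ in range(3000):
    i = rng.integers(n)                 # sample one data point uniformly
    z = x + beta * (x - x_prev)         # momentum (extrapolation) step
    x_prev, x = x, prox_step(z, A[i], b[i], eta)  # implicit proximal update

print(np.linalg.norm(x - x_true))  # distance to the true minimizer
```

Note that the proximal update is implicit (it solves a small subproblem exactly rather than taking a raw gradient step), which is the standard intuition for why proximal point methods tolerate larger step-sizes than SGD.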

Workshop on Optimization for Machine Learning, NeurIPS 2021 (Spotlight)