A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets

Part of Advances in Neural Information Processing Systems 25 (NIPS 2012)


Authors

Nicolas Le Roux, Mark Schmidt, Francis R. Bach

Abstract

We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.
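The abstract's central idea, keeping a memory of the most recent gradient of each training function and stepping along their average, can be sketched as follows. This is a minimal illustration rather than a verified reproduction of the paper's algorithm: the function name `sag_style_minimize`, the ridge-regularized least-squares test problem, and the fixed step size are all assumptions made for the example.

```python
import numpy as np

def sag_style_minimize(grad_i, x0, n, step=0.01, iters=1000, rng=None):
    """Minimize (1/n) * sum_i f_i(x) using a memory of per-function gradients.

    grad_i(x, i) returns the gradient of the i-th smooth function at x.
    Sketch of the gradient-memory idea described in the abstract; not the
    paper's exact method.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    memory = np.zeros((n, x0.size))   # last stored gradient of each f_i
    avg = np.zeros_like(x0)           # running average of stored gradients
    for _ in range(iters):
        i = rng.integers(n)           # sample one training function
        g = grad_i(x, i)              # fresh gradient for that function only
        avg += (g - memory[i]) / n    # update the average incrementally
        memory[i] = g                 # overwrite the stored gradient
        x -= step * avg               # step along the averaged gradient
    return x

# Example: ridge-regularized least squares, a strongly convex finite sum
# (hypothetical data, chosen only to exercise the sketch).
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
b = A @ np.ones(5)
lam = 0.1
grad = lambda x, i: (A[i] @ x - b[i]) * A[i] + lam * x
x_hat = sag_style_minimize(grad, np.zeros(5), n=100, step=0.05, iters=5000)
```

Updating the running average in place, rather than re-summing all stored gradients, keeps the per-iteration cost comparable to plain stochastic gradient descent while using the full-average direction the memory makes available.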