Thackeray 427
Abstract or Additional Information
Gradient descent is an optimization algorithm with provable convergence guarantees for smooth convex functions, but it can be quite slow owing to its simplicity. Adding a momentum step to the algorithm yields an accelerated convergence rate. These algorithms are widely used in modern machine learning for training neural networks. However, with today's large datasets and models, computing the exact gradient is infeasible, and only a stochastic approximation of the gradient is available. We study the convergence of these algorithms in the setting of noisy gradients. Further, we introduce an accelerated gradient descent algorithm (AGNES) that provably achieves an accelerated rate of convergence regardless of how noisy the gradients are. No prior knowledge of machine learning or optimization is required to follow this talk.
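For orientation, the updates discussed above can be sketched as follows. This is only the textbook formulation, with step size \eta, momentum parameter \beta, and objective f; the specific AGNES update is not reproduced here, and the noisy estimate g_k is a generic assumption rather than the talk's precise noise model.

% Plain gradient descent step on a smooth convex objective f
\[ x_{k+1} = x_k - \eta \, \nabla f(x_k) \]
% One common Nesterov-style momentum step
\[ v_{k+1} = \beta v_k - \eta \, \nabla f(x_k + \beta v_k), \qquad x_{k+1} = x_k + v_{k+1} \]
% Stochastic setting: the exact gradient is replaced by a noisy, unbiased estimate g_k
\[ \mathbb{E}[g_k] = \nabla f(x_k) \]

In the stochastic setting, g_k simply replaces \nabla f in the updates above, and the question studied in the talk is how fast such iterations converge as the variance of g_k grows.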