Tuning the step size of stochastic gradient descent is tedious and error-prone. This has motivated methods that adapt the step size automatically using readily available information. In this talk I will present the family of SPS (Stochastic gradient with a Polyak Stepsize) adaptive methods, which use the gradient and loss value at the sampled points to adjust the step size on the fly. I will show that SPS and its recent variants explicitly exploit models that interpolate, or come close to interpolating, the training data. I will also draw some new parallels between SPS methods and Passive-Aggressive methods, and then use this insight to develop new variants of the SPS method that are better suited to nonlinear models such as DNNs.
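To make the idea concrete, here is a minimal sketch of a capped stochastic Polyak step on a toy interpolating least-squares problem. This is an illustration, not the speaker's exact method: the safeguard constants `c` and `gamma_max` and the zero lower bound on the loss are common simplifying assumptions.

```python
import numpy as np

def sps_step(w, grad, loss, c=0.5, gamma_max=1.0, eps=1e-12):
    """One stochastic Polyak step: step size = loss / (c * ||grad||^2),
    capped at gamma_max. Assumes the minimal loss at the sampled point is 0
    (the interpolation setting); c and gamma_max are tunable safeguards."""
    step = min(loss / (c * np.dot(grad, grad) + eps), gamma_max)
    return w - step * grad

# Toy interpolating problem: f_i(w) = 0.5 * (a_i . w - b_i)^2 with b = A w*.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
w_star = rng.normal(size=5)
b = A @ w_star  # consistent system, so zero loss is attainable at every point

w = np.zeros(5)
for t in range(200):
    i = rng.integers(20)           # sample one data point
    r = A[i] @ w - b[i]            # residual at the sampled point
    loss = 0.5 * r**2
    grad = r * A[i]
    w = sps_step(w, grad, loss)

print(np.linalg.norm(A @ w - b))   # residual shrinks toward zero
```

Note that no step-size schedule is tuned by hand: the loss value itself tells the method how far to move, which is the key appeal of Polyak-type stepsizes.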
Bio: Robert M. Gower is a Research Scientist at the Flatiron Institute. Prior to that he held visiting scientist positions at Google Brain (2021) and Facebook AI Research (2020) in New York, while serving as an Assistant Professor at Télécom Paris from 2017 to 2021. He is interested in designing and analyzing new algorithms for solving optimization problems in machine learning and scientific computing. During his PhD at the University of Edinburgh he proposed the sketch-and-project methods for solving linear systems, for which he received second place in the 2017 Leslie Fox Prize in numerical analysis. His academic studies began with a Bachelor's and a Master's degree in applied mathematics at the State University of Campinas (Brazil), where he designed the current state-of-the-art algorithms for automatically computing high-order derivatives using back-propagation.