I will discuss three problems that distill different aspects of the challenge of estimation and learning from non-i.i.d. data. The first portion of the talk will focus on the challenge of making accurate predictions despite data coming from complex sampling processes. Here we introduce a framework and an accompanying algorithm that leverage knowledge of the data collection process but make no distributional assumptions on the underlying data values. In the second portion of the talk, I'll introduce the problem of "selective prediction/learning", where the learner observes a sequence of arbitrary (and even adversarial) data and is asked to make an accurate prediction about future data. Surprisingly, provided the learner can select both when to make their prediction and the duration over which that prediction will hold, prediction with subconstant error is possible. Finally, I will discuss the problem of producing "calibrated" predictions in the online binary prediction setting. This is one of the cleanest problems in sequential prediction that is still largely open. This talk is based on joint work with Mingda Qiao, Justin Chen, and Paul Valiant.
Bio: Gregory Valiant is an Associate Professor in Stanford's Computer Science Department, working at the intersection of Algorithms, Machine Learning, Statistics, and Information Theory. One of the main themes in his work is the design of efficient algorithms for accurately inferring information about complex distributions, given limited amounts of data or limits on other resources, such as computation time, available memory or communication, or the quality of the available data.