Abstract: Consider a very large graph---say, the link graph of a large social network. Now invent a randomized algorithm that extracts a smaller subgraph. If we use the subgraph as sample data and perform statistical analysis on this sample, what can we learn about the underlying network? Clearly, that should depend on the algorithm. We approach the problem by considering what distributional symmetries are satisfied by the algorithm. There is a specific algorithm for which the induced symmetry is precisely exchangeability. In this case, the appropriate statistical models are so-called graphon models, but things change drastically if seemingly minor modifications are made to the subsampler. I will discuss two types of results: (1) How symmetry properties explain what we can learn from a single sample. (2) Convergence properties of symmetric random variables: Laws of large numbers, central limit theorems and Berry-Esseen type bounds, which hold whether or not the symmetry property is derived from subsampling.
S. S. Wilks Memorial Seminar in Statistics