To enable valid statistical inference in prediction tasks, we show how to generate set-valued predictions with black-box models that control various notions of statistical error. Our approach guarantees that the expected loss on future test points falls below a user-specified level, for any predictive model and underlying distribution. Building on conformal prediction, we use a holdout set to calibrate the size of the prediction sets, generalizing the approach to control error notions such as the false rejection rate. We demonstrate our procedure in four large-scale problems: (1) multi-label classification, where each observation has multiple associated labels; (2) classification problems where the labels have a hierarchical structure; (3) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (4) protein structure prediction.
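To make the holdout calibration step concrete, the following is a minimal sketch for the multi-label setting, not the authors' implementation. It assumes per-label scores in [0, 1], a loss equal to the fraction of true labels missing from the predicted set, and a Hoeffding upper confidence bound on the holdout risk, which is one standard choice for losses bounded in [0, 1]. The function names, the threshold grid, and the failure probability `delta` are all hypothetical and introduced only for illustration.

```python
import numpy as np


def prediction_sets(scores, threshold):
    """Boolean (n, K) mask: label k is in the set when its score >= threshold."""
    return scores >= threshold


def missed_label_fraction(sets, labels):
    """Per-example loss in [0, 1]: share of true labels left out of the set."""
    labels = labels.astype(bool)
    n_true = labels.sum(axis=1)
    missed = (labels & ~sets).sum(axis=1)
    # Examples with no true labels incur zero loss.
    return np.where(n_true > 0, missed / np.maximum(n_true, 1), 0.0)


def calibrate_threshold(scores, labels, alpha, delta=0.1, grid=None):
    """Pick the largest threshold (smallest sets) whose Hoeffding upper bound
    on the holdout risk stays at or below alpha.

    Assumption: the loss is bounded in [0, 1] and can only grow as the
    threshold rises, so the scan can stop at the first failing threshold.
    """
    n = scores.shape[0]
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)  # hypothetical candidate thresholds
    slack = np.sqrt(np.log(1.0 / delta) / (2.0 * n))  # Hoeffding term
    chosen = None
    for t in np.sort(grid):  # ascending: largest sets first
        risk = missed_label_fraction(prediction_sets(scores, t), labels).mean()
        if risk + slack > alpha:
            break
        chosen = t
    if chosen is None:
        raise ValueError("No threshold on the grid meets the target level.")
    return chosen


# Synthetic holdout data (illustrative only): true labels score in [0.6, 1.0],
# false labels score in [0.0, 0.4].
rng = np.random.default_rng(0)
holdout_labels = rng.random((2000, 5)) < 0.5
holdout_scores = 0.6 * holdout_labels + 0.4 * rng.random((2000, 5))

t_hat = calibrate_threshold(holdout_scores, holdout_labels, alpha=0.1)
test_time_sets = prediction_sets(holdout_scores[:3], t_hat)
```

Raising the threshold shrinks the sets and can only increase this loss, which is why a single one-dimensional scan suffices here. Other error notions would swap in a different per-example loss while keeping the same calibration loop.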