Details
For the discrete-time setup, we will first study robustness to finite approximations for models with standard Borel state and action spaces, and present conditions under which finite models obtained through quantization of the state and action sets can be used to construct approximately optimal policies, assuming only weak Feller continuity of the true model. We will then investigate robustness to more general modeling errors by studying the mismatch loss incurred when optimal control policies designed for an incorrect model are applied to the true system, as the incorrect model approaches the true model under a variety of convergence criteria. In particular, we show that the expected induced cost is robust under continuous weak convergence of transition kernels (with additional regularity for partially observable models); under total variation or Wasserstein regularity, a modulus of continuity can also be established.
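As an illustration of the quantization idea (a minimal sketch, not taken from the talk), the code below builds a finite model for an assumed scalar example on [0, 1]: the state and action sets are quantized to grids, a transition matrix for the finite model is estimated by mapping sampled next states to nearest grid points, the finite model is solved by value iteration, and the resulting policy is extended back to the original state space by a nearest-neighbor map. The dynamics, cost, grid sizes, and discount factor are all illustrative assumptions.

```python
import numpy as np

# Sketch: quantize a scalar-state, scalar-action model on [0, 1] into a finite
# MDP, solve it by value iteration, and extend the finite-model policy to the
# original state space by nearest-neighbor quantization. All model ingredients
# below (dynamics, cost, grids, discount) are illustrative assumptions.

rng = np.random.default_rng(0)

N_X, N_U = 50, 11                        # numbers of state and action bins
BETA = 0.95                              # discount factor
X_GRID = (np.arange(N_X) + 0.5) / N_X    # bin representatives in [0, 1]
U_GRID = np.linspace(0.0, 1.0, N_U)

def cost(x, u):
    # illustrative stage cost
    return (x - 0.5) ** 2 + 0.1 * u ** 2

def next_state_samples(x, u, n=200):
    # illustrative dynamics: mean reversion with control and truncated noise
    w = rng.normal(0.0, 0.1, size=n)
    return np.clip(0.9 * x + 0.1 * u + w, 0.0, 1.0)

# Estimate the finite model's transition matrix by mapping sampled next states
# to their nearest grid points.
P = np.zeros((N_X, N_U, N_X))
C = np.zeros((N_X, N_U))
for i, x in enumerate(X_GRID):
    for j, u in enumerate(U_GRID):
        C[i, j] = cost(x, u)
        idx = np.argmin(np.abs(next_state_samples(x, u)[:, None] - X_GRID[None, :]), axis=1)
        P[i, j] = np.bincount(idx, minlength=N_X) / idx.size

# Value iteration on the finite model.
V = np.zeros(N_X)
for _ in range(500):
    Q = C + (BETA * P) @ V               # shape (N_X, N_U)
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
policy_on_grid = Q.argmin(axis=1)

def policy(x):
    # Extend the finite-model policy to the original state space by quantizing x.
    return U_GRID[policy_on_grid[np.argmin(np.abs(X_GRID - x))]]

print("action at x = 0.3:", policy(0.3))
```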
As an application of robustness under continuous weak convergence to model learning, we obtain (i) robustness to empirical model learning for discounted and average cost criteria, and (ii) near optimality of a quantized Q-learning algorithm, which we show converges to an optimal solution of an approximate model.
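A minimal sketch of a quantized Q-learning loop in the same spirit is given below; it is not the talk's algorithm, and the dynamics, cost, and step sizes are assumptions. The learner only observes the quantization bin of the state, so the iterates target the Q-function of an approximate finite model.

```python
import numpy as np

# Sketch: Q-learning run on quantized observations of a continuous-state
# system. Because the learner sees only the bin index of the state, the
# iterates converge (under suitable conditions) to the Q-function of an
# approximate finite model. Dynamics, cost, and step sizes are assumptions.

rng = np.random.default_rng(1)

N_X, N_U = 50, 11
BETA = 0.95
X_GRID = (np.arange(N_X) + 0.5) / N_X
U_GRID = np.linspace(0.0, 1.0, N_U)

def quantize(x):
    return int(np.argmin(np.abs(X_GRID - x)))

def step(x, u):
    # illustrative true (continuous-state) dynamics and stage cost
    x_next = np.clip(0.9 * x + 0.1 * u + rng.normal(0.0, 0.1), 0.0, 1.0)
    c = (x - 0.5) ** 2 + 0.1 * u ** 2
    return x_next, c

Q = np.zeros((N_X, N_U))
visits = np.zeros((N_X, N_U))
x = rng.uniform()

for t in range(100_000):
    i = quantize(x)
    j = rng.integers(N_U)                # exploration: uniformly random actions
    x_next, c = step(x, U_GRID[j])
    i_next = quantize(x_next)

    visits[i, j] += 1
    alpha = 1.0 / visits[i, j]           # decreasing step size per (bin, action) pair
    target = c + BETA * Q[i_next].min()  # costs are minimized
    Q[i, j] += alpha * (target - Q[i, j])
    x = x_next

greedy = Q.argmin(axis=1)
print("greedy action at x = 0.3:", U_GRID[greedy[quantize(0.3)]])
```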
For continuous-time models, we obtain counterparts of the above results on both robustness and approximations: we first present existence and discrete-time approximation results for finite-horizon and infinite-horizon discounted/ergodic optimal control problems for controlled diffusions. Under a convergence criterion on the models, we show that the mismatch error decreases to zero as the incorrect model approaches the true model. Discrete-time approximations under several criteria and information structures will then be obtained via a unified probabilistic approach. Finally, we present a robustness result for controlled stochastic differential equations driven by approximations of the Brownian motion.
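As a rough illustration of a time discretization of a controlled diffusion (again a sketch under stated assumptions, not the talk's construction), the code below applies an Euler-Maruyama-type scheme to an assumed scalar controlled diffusion under a fixed Markov policy and a discounted cost; the drift, diffusion coefficient, cost, policy, and step sizes are illustrative, and refining the step size h corresponds to the discrete-time approximations discussed above.

```python
import numpy as np

# Sketch: Euler-Maruyama-type discretization of a scalar controlled diffusion
#     dX_t = b(X_t, U_t) dt + sigma(X_t) dW_t,
# evaluated for a fixed Markov policy under a discounted cost via Monte Carlo.
# All model ingredients are illustrative assumptions.

rng = np.random.default_rng(2)

def b(x, u):          # illustrative drift
    return -x + u

def sigma(x):         # illustrative diffusion coefficient
    return 0.5

def running_cost(x, u):
    return x ** 2 + 0.1 * u ** 2

def policy(x):        # a fixed (assumed) Markov control
    return -0.5 * x

def discounted_cost_em(x0, h=1e-2, horizon=50.0, discount=1.0, n_paths=2000):
    """Monte Carlo estimate of the discounted cost under the Euler-Maruyama scheme."""
    n_steps = int(horizon / h)
    x = np.full(n_paths, x0, dtype=float)
    total = np.zeros(n_paths)
    for k in range(n_steps):
        u = policy(x)
        total += np.exp(-discount * k * h) * running_cost(x, u) * h
        dw = rng.normal(0.0, np.sqrt(h), size=n_paths)   # Brownian increments
        x = x + b(x, u) * h + sigma(x) * dw
    return total.mean()

# Coarser vs. finer time steps: the two estimates should be close for small h.
print(discounted_cost_em(1.0, h=5e-2), discounted_cost_em(1.0, h=1e-2))
```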
[Joint work with Ali D. Kara, Somnath Pradhan, Zachary Selk, Naci Saldi, and Tamas Linder].