Bounded-parameter Markov decision processes

Authors:

Abstract

In this paper, we introduce the notion of a bounded-parameter Markov decision process (BMDP) as a generalization of the familiar exact MDP. A bounded-parameter MDP is a set of exact MDPs specified by giving upper and lower bounds on transition probabilities and rewards (all the MDPs in the set share the same state and action space). BMDPs form an efficiently solvable special case of the already known class of MDPs with imprecise parameters (MDPIPs). Bounded-parameter MDPs can be used to represent variation or uncertainty concerning the parameters of sequential decision problems in cases where no prior probabilities on the parameter values are available. Bounded-parameter MDPs can also be used in aggregation schemes to represent the variation in the transition probabilities for different base states aggregated together in the same aggregate state.
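The definition above (a set of exact MDPs delimited by interval bounds on transition probabilities and rewards) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the class `BoundedMDP`, the method `contains`, the choice to attach rewards to states, and the toy numbers. The sketch only tests whether a given exact MDP belongs to the set; it does not implement any solution algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


@dataclass
class BoundedMDP:
    """Hypothetical container for a bounded-parameter MDP (illustrative names)."""
    states: List[str]
    actions: List[str]
    p_bounds: Dict[Tuple[str, str, str], Interval]  # (s, a, s') -> probability bounds
    r_bounds: Dict[str, Interval]                   # s -> reward bounds (assumed state-based)

    def contains(self, p, r, tol=1e-9):
        """Return True if the exact MDP (p, r) lies inside every interval."""
        for s in self.states:
            lo, hi = self.r_bounds[s]
            if not (lo - tol <= r[s] <= hi + tol):
                return False
            for a in self.actions:
                row = [p[(s, a, t)] for t in self.states]
                # An exact MDP needs a proper distribution for each (s, a).
                if abs(sum(row) - 1.0) > tol:
                    return False
                for t, prob in zip(self.states, row):
                    lo, hi = self.p_bounds[(s, a, t)]
                    if not (lo - tol <= prob <= hi + tol):
                        return False
        return True


# Toy example: two states, one action (numbers are made up).
bmdp = BoundedMDP(
    states=["s0", "s1"],
    actions=["a"],
    p_bounds={
        ("s0", "a", "s0"): (0.6, 0.8), ("s0", "a", "s1"): (0.2, 0.4),
        ("s1", "a", "s0"): (0.0, 0.1), ("s1", "a", "s1"): (0.9, 1.0),
    },
    r_bounds={"s0": (0.0, 1.0), "s1": (2.0, 3.0)},
)
rewards = {"s0": 0.5, "s1": 2.5}
p_inside = {("s0", "a", "s0"): 0.7, ("s0", "a", "s1"): 0.3,
            ("s1", "a", "s0"): 0.05, ("s1", "a", "s1"): 0.95}
p_outside = {**p_inside, ("s0", "a", "s0"): 0.9, ("s0", "a", "s1"): 0.1}

inside = bmdp.contains(p_inside, rewards)
outside = bmdp.contains(p_outside, rewards)
```

In this toy case `p_inside` respects every interval, while `p_outside` puts probability 0.9 on a transition bounded above by 0.8, so only the first exact MDP is a member of the set.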

Keywords: Decision-theoretic planning, Planning under uncertainty, Approximate planning, Markov decision processes

Review history: Received 28 May 1999, Revised 22 May 2000, Available online 6 October 2000.

Paper URL: https://doi.org/10.1016/S0004-3702(00)00047-3