Optimal threshold probability in undiscounted Markov decision processes with a target set

Authors:

Abstract

We consider risk-minimizing problems in undiscounted Markov decision processes with a target set. We formulate the problem as an infinite-horizon case with a recurrent class. We show that the optimal value function is the unique solution to an optimality equation and that there exists a stationary optimal policy. We also give several value iteration methods and a policy improvement method.
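To make the idea of value iteration for a threshold probability concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a finite state space, strictly positive integer step costs, and a risk defined as the probability that the accumulated cost exceeds a budget before the target set is reached. The function name, the data layout (`P`, `cost`, `target`, `budget`) and the toy model are all hypothetical, chosen only for illustration.

```python
def threshold_value_iteration(P, cost, target, budget, tol=1e-12, max_sweeps=10_000):
    """Illustrative value iteration on the augmented state (state, remaining budget).

    Assumed (hypothetical) layout:
      P[s][a]    -> list of (next_state, probability) pairs
      cost[s][a] -> strictly positive integer cost of action a in state s
      target     -> set of absorbing target states
    Returns V with V[s][b] = minimal probability that the total cost
    exceeds b before the process enters the target set.
    """
    n = len(P)
    V = [[0.0] * (budget + 1) for _ in range(n)]

    def risk_after(values, s_next, b_next):
        if b_next < 0:          # budget already exceeded: the bad event happened
            return 1.0
        if s_next in target:    # target reached within budget: no risk
            return 0.0
        return values[s_next][b_next]

    for _ in range(max_sweeps):
        new_V = [row[:] for row in V]
        delta = 0.0
        for s in range(n):
            if s in target:
                continue
            for b in range(budget + 1):
                # Bellman update: choose the action with the smallest expected risk.
                best = min(
                    sum(p * risk_after(V, s2, b - cost[s][a]) for s2, p in P[s][a])
                    for a in range(len(P[s]))
                )
                delta = max(delta, abs(best - V[s][b]))
                new_V[s][b] = best
        V = new_V
        if delta < tol:
            break
    return V


# Toy model (assumed): state 0 is the start, state 1 is the target.
# Action 0 ("safe"): cost 2, reaches the target with certainty.
# Action 1 ("risky"): cost 1, reaches the target with probability 0.5, else stays put.
P = [[[(1, 1.0)], [(1, 0.5), (0, 0.5)]], []]
cost = [[2, 1], []]
V = threshold_value_iteration(P, cost, target={1}, budget=2)
```

In this toy model the minimal risk from state 0 is 1 with budget 0, 0.5 with budget 1 (only the risky action can succeed), and 0 with budget 2 (the safe action always fits). Because every step costs at least 1 in this sketch, the iteration settles after at most `budget + 1` sweeps; the undiscounted setting treated in the paper is more general.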

Keywords: Markov decision process, Minimizing risk model, Existence of optimal policy, Value iteration, Policy improvement method

Review history: Available online 3 April 2003.

Official URL: https://doi.org/10.1016/S0096-3003(03)00158-9